Optimal Stopping Algorithms for the Secretary Problem

I present two algorithms that determine when a secretary should be chosen to maximize the likelihood of choosing the highest-ranked secretary.

The secretary problem is as follows. There are nn secretaries, each with a positive valuation. A secretary xx is ranked higher than secretary yy if they have a higher valuation than secretary yy. A manager knows that there nn secretaries but must interview each secretary sequentially. The order in which the manager interviews secretaries is random. During each interview, the manager either accepts or rejects the secretary. If the manager accepts the secretary, then the manager does not interview any more secretaries. If the manager rejects the secretary, he may continue to interview more secretaries. The manager’s goal is to select the highest-ranked secretary.

Constant Stopping Algorithm

In this section, I present an algorithm that determines when to stop learning in order to choose the highest-ranked secretary. One possible way to try to select the highest-ranked secretary is to observe the valuation of the first kk secretaries, and then if a secretary in the next nkn-k secretaries has a higher valuation than any of the secretaries in the first kk secretaries, accept that secretary. This is what the algorithm below states.

Algorithm. Reject k Secretaries\textbf{Algorithm. Reject \textit{k} Secretaries}

  1. Reject the first kk secretaries and track the highest-ranked secretary.
  2. For the last nkn-k secretaries, if a secretary is ranked higher than the highest-ranked secretary in the first kk secretaries, then accept the secretary.

Proposition.\textbf{Proposition.}
The algorithm reject first kk secretaries selects the highest-ranked secretary with 1n(1+kk+1+kk+2+\frac{1}{n}\left(1+\frac{k}{k+1}+\frac{k}{k+2}+\right. +kn1)\left.\cdots+\frac{k}{n-1}\right) probability.

Proof\textit{Proof}. Consider the law of total probability. The probability that the highest-ranked secretary is in any position is 1/n1 / n. If the highest-ranked secretary is one of the first kk secretaries, then they are rejected. If the highest-ranked secretary is in position k+1k+1, then they are selected with probability 11. Likewise, if they are in position k+2k+2, then they are selected with probability 11k+1=kk+11-\frac{1}{k+1}=\frac{k}{k+1}. This is because the probability of selecting the highest-ranked secretary is equal to the probability that the highest-ranked secretary among the first k+1k+1 secretaries appears in the first kk positions. This continues until the highest-ranked secretary is in position nn. In this case, they are selected with probability kn1\frac{k}{n-1}. Hence, by the total law of probability, the statement is true.

Proposition.\textbf{Proposition.}
Rejecting approximately the first n/en/e secretaries yields the highest probability of selecting the highest-ranked secretary.

Proof\textit{Proof}. Consider maximizing the probability of selecting the highest-ranked secretary. The probability 1n(1+kk+1+kk+2++kn1)=kn(i=kn11i)\frac{1}{n}\left(1+\frac{k}{k+1}+\frac{k}{k+2}+\cdots+\frac{k}{n-1}\right)=\frac{k}{n}\left(\sum_{i=k}^{n-1} \frac{1}{i}\right) is approximately equal to kn(ln(n1k1))\frac{k}{n}\left(\ln \left(\frac{n-1}{k-1}\right)\right) which is approximately equal to kn(ln(nk))\frac{k}{n}\left(\ln \left(\frac{n}{k}\right)\right) for large kk and nn. This is maximized when the derivative with respect to kk is equal to 00. Hence, 1n(ln(nk)1)=0k=n/e\frac{1}{n}\left(\ln \left(\frac{n}{k}\right)-1\right)=0 \Leftrightarrow k=n / e.

Dynamic Programming Stopping Algorithm

Determining when to stop or accept a secretary can also be found by using methods from dynamic programming. The optimal stopping policy is when the expected absolute ranking of the secretary at some stage is greater than or equal to the expected absolute ranking of future secretaries. In order to present this result, I define some important notation.

Suppose that ss is the cardinality of a subset of the nn secretaries. The relative rank of the ii th secretary out of ss secretaries is a number from 11 to ss. The absolute rank of the ii th secretary out of nn secretaries is a number from 11 to nn. Let the random variable XiX_i be the relative rank of the ii th secretary among the first ii secretaries. Assume that the variables XiX_i are identically, independently, and uniformly distributed.

Lemma.\textbf{Lemma.}
The probability that a secretary of relative rank xx among the first ss secretaries has absolute rank bb is given by the distribution

f(bs,x)=(b1x1)(nbsx)(ns) f(b \mid s, x)=\frac{\binom{b-1}{x-1}\binom{n-b}{s-x}}{\binom{n}{s}}

for all b=x,,ns+xb=x, \ldots, n-s+x.

Similar to the problem above, I consider only the conditional probability of when the secretary’s absolute rank is 11, given their relative rank. In this case, the above lemma implies that f(1s,x)=f(1 \mid s, x)= sn\frac{s}{n} if x=1x=1 and 00 if x1x \neq 1.

The dynamic programming problem can be formulated as

vs(x)=max{f(1s,x),1s+1x=1s+1vs+1(x)} v_s(x)=\max \left\{f(1 \mid s, x), \frac{1}{s+1} \sum_{x=1}^{s+1} v_{s+1}(x)\right\}

such that vn(x)=1v_n(x)=1 if x=1x=1 and 00 otherwise. The first part of the max\max function, i.e. f(1s,x)f(1 \mid s, x) represents the conditional expected absolute ranking of the ss secretary. The second part of the max function represents the expected absolute value of future secretaries, where values closer to zero represent a future secretary with an absolute ranking of one is unlikely. Values closer to one represent a secretary with an absolute ranking of one is likely. Consider the following policy to decide when to stop.

Algorithm. A Stopping Policy\textbf{Algorithm. A Stopping Policy}

  1. If at secretary s{1,,n1}s \in\{1, \ldots, n-1\}, f(1s,x)1s+1x=1s+1vs+1(x)f(1 \mid s, x) \geq \frac{1}{s+1} \sum_{x=1}^{s+1} v_{s+1}(x), then choose secretary ss and stop. Otherwise, continue.

  2. If no secretaries from 1,,n11, \ldots, n-1 is chosen, then choose secretary nn.

Proposition.\textbf{Proposition.}
For some m(n)m(n), the policy selects the highest-ranked secretary with probability 1n(1+m1m++m1n2+m1n1)\frac{1}{n}\left(1+\frac{m-1}{m}+\cdots+\frac{m-1}{n-2}+\frac{m-1}{n-1}\right).

Proof.\textit{Proof.} Use the law of total probability. The probability that the policy selects the highest-ranked secretary is the sum of the probabilities that it selects the highest-ranked secretary at position i=1,,ni=1, \ldots, n. The probability of selecting the highest-ranked secretary at position ii is equal to the probability that the highest-ranked secretary is at position ii, given that ii is selected and the probability that ii is selected. For i=1,,n1i=1, \ldots, n-1, the conditional probability is equal to f(1i,1)=inf(1 \mid i, 1)=\frac{i}{n}. For secretary i=ni=n, the probability is equal to 1n\frac{1}{n}. Further, the lemma below describes the probability of selecting a secretary. Hence, for all s>ms>m, define

h(m,s):=1si=ms1i1i=1sm1s1. h(m, s):=\frac{1}{s} \prod_{i=m}^{s-1} \frac{i-1}{i}=\frac{1}{s} \frac{m-1}{s-1}.

The probability that the policy selects the highest-ranked secretary is

1m(mn)+h(m,m+1)m+1n++h(m,n1)n1n+m1n11n=1n+m1m1n++m1n21n+m1n11n=1n(1+m1m++m1n2+m1n1). \begin{array}{l} \frac{1}{m}\left(\frac{m}{n}\right)+h(m, m+1) \frac{m+1}{n}+\cdots+h(m, n-1) \frac{n-1}{n}+\frac{m-1}{n-1} \frac{1}{n} \\ =\frac{1}{n}+\frac{m-1}{m} \frac{1}{n}+\cdots+\frac{m-1}{n-2} \frac{1}{n}+\frac{m-1}{n-1} \frac{1}{n} \\ =\frac{1}{n}\left(1+\frac{m-1}{m}+\cdots+\frac{m-1}{n-2}+\frac{m-1}{n-1}\right). \end{array}

The result here is incredibly similar to the probability that the reject first kk secretaries algorithm finds the highest-ranked secretary. It can also be shown that under the provided stopping policy, mm is approximately ne+1\frac{n}{e} + 1.

Lemma.\textbf{Lemma.}
For some m(n)m(n), the policy selects secretary mm with probability 1m\frac{1}{m}, secretary m+1m+1 with probability h(m,m+1)h(m, m+1), secretary m+2m+2 with probability h(m,m+2),h(m, m+2), \ldots, and secretary n1n-1 with probability h(m,n1)h(m, n-1). Secretary nn is selected with probability m1n1\frac{m-1}{n-1}. For all secretary l<ml<m, the policy selects secretary ll with probability 00.

Proof.\textit{Proof.} At any specific secretary s{1,,n}s \in\{1, \ldots, n\}, the policy selects secretary ss with probability

P(f(1s,Xs)1s+1x=1s+1vs+1(x))={1s if sm0 if s<m. P\left(f\left(1 \mid s, X_s\right) \geq \frac{1}{s+1} \sum_{x=1}^{s+1} v_{s+1}(x)\right)=\left\{\begin{array}{l} \frac{1}{s} \text { if } s \geq m \\ 0 \text { if } s<m \end{array}.\right.

The probability is positive only when f(1m,x)>0f(1 \mid m, x)>0, which occurs when x=1x=1. The probability of the relative rank being one out of mm ranks is 1/m1 / m. Further, if f(1m,x)>0f(1 \mid m, x)>0 for some mm, then f(1m+1,x)>0f(1 \mid m+1, x)>0 if x=1x=1. Hence, secretary mm is selected with probability 1/m1 / m, and a subsequent secretary pp is selected with the probability that secretaries mm to p1p-1 are not selected and pp is selected. Secretary nn is selected if no other previous secretaries were selected, which is

(11m)(11m+1)(11n2)(1n1)=m1n1. \left(1-\frac{1}{m}\right)\left(1-\frac{1}{m+1}\right) \ldots\left(1-\frac{1}{n-2}\right)\left(\frac{1}{n-1}\right)=\frac{m-1}{n-1}.