If you are asked to “find the k-th largest element in an unsorted array of length n” at an interview

Standard

First, think twice about this problems. For a general question like this, it can be shaped into different questions which might have different optimal solutions. Therefore, clarify things with the interviewer if needed. Following are some directions you can think:

  • About the elements in the given array

Is this an integer array?

Are there duplicate elements?

Are the elements sparse? (Will be explained below)

  • About what to return

If array contains duplicated numbers, how is the “k-th largest element” defined? Does it imply “finding the k-th distinct largest element” (e.g. the 2nd largest element in {1, 5, 5, 5, 3, 3} is 3 instead of 5)?

Do we need to maintain the order of the array while returning?

  • About k, n, and the relationship between them

Is n very large?

Is k a relatively small number as compared to n?

Is k a function of n?

What if k is larger than n/What if the k-th largest element does not exist (e.g. finding the fourth largest element in {1, 2, 3} or {1, 1, 1})?

  • About the algorithm complexity analysis

What’s the best solutions you can come up with in turns of time or space?

Are there any constraints on time/space?

Is there a priority in the solution (e.g. fastest performance, least space, etc)?

What next? Start with some simple solution and work on refining them later. This not only saves yourself from stressful thinking, but also helps you think out loud and show the interviewers your analysis capability.

The solutions you can think of probably are in these 2 directions: A) sort the array and then take k th largest element, B) select first k largest element one by one. From easy to hard, following is the list of possible solutions.

  1. A) Sorting based algorithms
  • Sort the array using one of the O(n^2) algorithms (e.g. bubble/insertion/selection sort) then return the k-th largest element.
  • Sort the array using one of the O(nlogn) algorithms (e.g. merge/quicksort/heap sort) then return the k-th largest element.
  • If this is an integer array (or, the number of digits is a constant instead of a function of n), we can sort the array using the radix/counting/bucket algorithm (depending on sparsity/duplication) with time complexity O(n) then return the k-th largest element.
  • If k is relatively small as compared to n, instead of sorting the entire array, we could use a modified selection sort (i.e., perform a selection sort but stop after obtaining the largest k elements) with O(nk).
  1. B) Selection based algorithms
  • Select largest element k times, the maximum value in k-th iteration would be the k-th largest element. Time complexity is then O(kn).

Finally, some test cases:

  • An array of many duplicates, e.g. {1, 1, 1, 5, 5, 5, 3, 3, 3, …}
  • An array with duplicated pivots, e.g. finding the 4th largest element in {1, 2, 3, 4, 4, 4, 5, 6, 7}
  • An array with a single value, e.g. {5, 5, 5, 5, 5, …}
  • A sorted array (ascending and descending)
  • An array with the pivot as the first/last element
  • In real world implementations, test k and n whenever necessary, make the function robust to illegal arguments

Notes

A similar problem is to “find the top k elements in an unsorted integer array of length n”. Solving the first problem essentially solves the second one – one is to find the pivot, and the other is to partition the array based on the pivot. However, if there are duplicates in the array, the first problem may be harder to optimize in some cases. Consider “finding the top 90 elements in an array of length 100” vs “finding the 90th largest element in an array of length 100”. When there are duplicates in the array, the former can be translated into “finding the bottom 10 elements”, whereas the latter cannot be translated into “finding the 10th smallest element” .

We are assuming that when there are duplicates, “finding the k-th largest element” implies “finding the k-th distinct largest element”, e.g. the 2nd largest element in {1, 3, 3, 5, 5, 5} is 3 instead of 5.

Advertisements

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s