Computing Convex Hull in Python

Geometric algorithms involve questions that would be simple to solve by a human looking at a chart, but are complex because there needs to be an automated process.

One example is: given four points on a 2-dimensional plane, and the first three of the points create a triangle, determine if the fourth point lies inside or outside the triangle.

Another geometric problem is: given a number of points on a 2-dimensional plane, compute the minimum number of boundary points, that if connected, would contain all the points without creating a concave angle.

Here is one of the solutions I generated in Python:

convex hull plot

So how do you do it?

I got a clue from a lecture. It involves using a point as a pivot and determining which of two other points are the most clockwise from each other.

Before I watched more of the lecture, I was determined to figure out an algorithm that would solve it in a reasonable amount of time.

My idea was:

  1. sort the points from left to right (least value of x to largest) - O(n log n) where n is the number of (x, y) points
  2. starting with the leftmost point p:
    1. go through each point to the right of that point, and using p as a pivot, find which point is the most clockwise. O(n)
    2. set the most clockwise point as the new p - O(1)
    3. loop again with new p
  3. this continues until the starting point is reached O(h) - where h is the number of hull points

In order to "prematurely optimize" (I know it's bad) I was trying to make the all the comparisons only on points to the right of p, but then I would need to flip and go the other way once the max x value was reached.

It was turning out to be way more complicated than it should be. I was trying to get it from O(n2) down to O(n log n) but really all my optimizations were just making it O((n log n) + (n * h)).

I ended up cleaning it up and just getting the algorithm where it was correct, not fast. Then once it was correct, I would make it faster.

So I tore out a bunch of code and just got it working. And it worked beautifully.

I got rid of all the code that figured out if comparison points were to the right of the pivot point. They didn't help improve the complexity.

I was able to remove the sort, also. It wasn't needed. I could find my start point, the minimum x-value point, in linear time. It didn't matter what order the comparison points were in, since I was keeping track of the maximum clockwise-dness as I went along, the same as a linear search for the maximum value in an unsorted array.

This was the new way:

  1. Find the minimum x-value point, the initial point p - O(n)
  2. Using p as a pivot:
    1. find which other point is the most clockwise - O(n)
    2. set it as new p - O(1)

I ended up with h pivot points, each comparing its n neighbors to the one with the maximum clockwise angle.

This is O(n * h).

So I watched the rest of the lecture and it turns out my algorithm was one of the 2 solutions. It's called the Jarvis march, aka "the gift-wrapping algorithm", published in 1973.

from collections import namedtuple  
import matplotlib.pyplot as plt  
import random

Point = namedtuple('Point', 'x y')


class ConvexHull(object):  
    _points = []
    _hull_points = []

    def __init__(self):
        pass

    def add(self, point):
        self._points.append(point)

    def _get_orientation(self, origin, p1, p2):
        '''
        Returns the orientation of the Point p1 with regards to Point p2 using origin.
        Negative if p1 is clockwise of p2.
        :param p1:
        :param p2:
        :return: integer
        '''
        difference = (
            ((p2.x - origin.x) * (p1.y - origin.y))
            - ((p1.x - origin.x) * (p2.y - origin.y))
        )

        return difference

    def compute_hull(self):
        '''
        Computes the points that make up the convex hull.
        :return:
        '''
        points = self._points

        # get leftmost point
        start = points[0]
        min_x = start.x
        for p in points[1:]:
            if p.x < min_x:
                min_x = p.x
                start = p

        point = start
        self._hull_points.append(start)

        far_point = None
        while far_point is not start:

            # get the first point (initial max) to use to compare with others
            p1 = None
            for p in points:
                if p is point:
                    continue
                else:
                    p1 = p
                    break

            far_point = p1

            for p2 in points:
                # ensure we aren't comparing to self or pivot point
                if p2 is point or p2 is p1:
                    continue
                else:
                    direction = self._get_orientation(point, far_point, p2)
                    if direction > 0:
                        far_point = p2

            self._hull_points.append(far_point)
            point = far_point

    def get_hull_points(self):
        if self._points and not self._hull_points:
            self.compute_hull()

        return self._hull_points

    def display(self):
        # all points
        x = [p.x for p in self._points]
        y = [p.y for p in self._points]
        plt.plot(x, y, marker='D', linestyle='None')

        # hull points
        hx = [p.x for p in self._hull_points]
        hy = [p.y for p in self._hull_points]
        plt.plot(hx, hy)

        plt.title('Convex Hull')
        plt.show()


def main():  
    ch = ConvexHull()
    for _ in range(50):
        ch.add(Point(random.randint(-100, 100), random.randint(-100, 100)))

    print("Points on hull:", ch.get_hull_points())
    ch.display()


if __name__ == '__main__':  
    main()

The other algorithm, at O(n log n), uses a sort and then a simple single pass of all the points, and making only left turns as it goes around the perimeter counter-clockwise. When the next point is a right turn, it backtracks past all points (using a stack and popping points off) until that turn turns into a left turn. This algorithm is called the Graham scan.

Which algorithm is better? It depends on your points. If most of the points will lie on the hull, the n log n algorithm will be better. If you have relatively few hull points bounding most of the points, the n*h will be better. You could always plot a random sample of the points on a graph and then choose your algorithm from there. I think most points that resemble randomness will benefit from the Jarvis march.

comments powered by Disqus