Counting the people in an outdoor line is challenging, and an accurate prediction takes a few steps. First, the raw camera feed from Shake Shack captures not only the line but also the park behind it and the outdoor seating. From this perspective the crowd is dense, with most bodies and faces obscured, which makes it difficult to analyze with traditional machine learning models. In addition, the line is exposed to the elements: snow, inconsistent lighting, shadows, and even umbrellas.
Count creates a density map that predicts the likelihood of each pixel being a person, which lets us calculate the number of people in the entire frame. We then use a second model to separate the line from the crowd: people waiting at the starting point, side by side. Differentiating between a distant crowd and an actual line has its own set of subtle challenges.
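To make the density-map idea concrete, here is a minimal sketch in plain Python. It is not the project's actual code. It assumes the density values are scaled so that each person contributes a total mass of about 1, which makes the head count the sum over all pixels. The names `make_toy_density`, `count_people`, and `line_mask` are hypothetical, and the mask is only a stand-in for whatever the second model produces to mark the line.

```python
import math


def make_toy_density(height, width, centers, sigma=2.0):
    """Build a synthetic density map: one Gaussian blob of total mass 1 per person."""
    density = [[0.0] * width for _ in range(height)]
    for cy, cx in centers:
        blob = [
            [math.exp(-((y - cy) ** 2 + (x - cx) ** 2) / (2 * sigma ** 2))
             for x in range(width)]
            for y in range(height)
        ]
        mass = sum(map(sum, blob))
        # Normalize so that each person adds exactly 1 to the frame total.
        for y in range(height):
            for x in range(width):
                density[y][x] += blob[y][x] / mass
    return density


def count_people(density, mask=None):
    """Sum density values, optionally restricted to pixels where mask is True."""
    total = 0.0
    for y, row in enumerate(density):
        for x, value in enumerate(row):
            if mask is None or mask[y][x]:
                total += value
    return total


# Three hypothetical people at (row, column) positions in a 64x64 frame.
toy = make_toy_density(64, 64, [(10, 12), (30, 40), (50, 20)])
frame_count = count_people(toy)

# Hypothetical line region: the top 20 rows of the frame (an assumption,
# standing in for the output of the line-versus-crowd model).
line_mask = [[y < 20 for _ in range(64)] for y in range(64)]
line_count = count_people(toy, line_mask)

print(f"people in frame: {frame_count:.2f}, people in line: {line_count:.2f}")
```

Summing a density map rather than detecting individuals is what lets this approach cope with heavily occluded crowds: a partially hidden person still contributes some mass even when no full body or face is visible.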
Feel free to read more about it at http://blog.dimroc.com/2017/11/19/counting-crowds-and-lines/