We contribute a scalable, open-source implementation of the Pooled Time Series (PoT) algorithm from CVPR 2015. The algorithm is evaluated on approximately 6800 human trafficking (HT) videos collected from the deep and dark web, and on an open dataset: the Human Motion Database (HMDB). We describe PoT, our motivation for applying it to larger datasets, and the issues we encountered in doing so. Our new solution reimagines PoT as an Apache Hadoop-based algorithm. We demonstrate that this Hadoop-based algorithm successfully identifies similar videos in the HT and HMDB datasets, and we evaluate it both qualitatively and quantitatively.
C. Mattmann, M. Sharan. Scalable Hadoop-Based Pooled Time Series of Big Video Data from the Deep Web. Proceedings of the ACM International Conference on Multimedia Retrieval (ICMR), Bucharest, Romania, June 6-9, 2017. pp. 117-120. DOI: 10.1145/3078971.3079019
This material is based upon work supported by the National Science Foundation under Grant No. 1639675. Opinions, findings, conclusions or recommendations expressed are those of the authors and do not reflect the views of the NSF.