Example of a coreset cluster. The coreset point is indicated in red. Highly similar views are clustered together indicating that the coreset is effective in removing redundancy in the data.

Given an image stream, we demonstrate an online algorithm that selects the semantically important images summarizing the visual experience of a mobile robot. Our approach consists of data pre-clustering using coresets, followed by a graph-based incremental clustering procedure over a topic-based image representation. A coreset for an image stream is a set of representative images that semantically compresses the data corpus, in the sense that every frame has a similar representative image in the coreset. We prove that our algorithm efficiently computes the smallest possible coreset under a natural, well-defined similarity metric, up to a provably small approximation factor. The output visual summary is computed via a hierarchical tree of coresets built over different parts of the image stream. This allows multi-resolution summarization (or a video summary of specified duration) in the batch setting, and a memory-efficient incremental summary in the streaming case.
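The streaming side of the approach, a hierarchical tree of coresets maintained incrementally, can be sketched with the standard merge-and-reduce pattern: fixed-size buffers are reduced to leaf coresets, and two coresets at the same tree level are merged and re-reduced, so memory grows only logarithmically with the stream length. The coreset construction below is a hypothetical stand-in (greedy farthest-point selection under a user-supplied distance), not the paper's construction; for image streams the distance would compare topic-based image representations.

```python
def coreset(points, k, dist):
    """Stand-in coreset: greedy farthest-point selection.
    Every input point ends up close (under `dist`) to one of the
    at-most-k selected representatives."""
    if not points:
        return []
    reps = [points[0]]
    while len(reps) < k and len(reps) < len(points):
        # Pick the point farthest from its nearest representative.
        far = max(points, key=lambda p: min(dist(p, r) for r in reps))
        reps.append(far)
    return reps

class StreamingCoresetTree:
    """Merge-and-reduce coreset tree for a stream.

    Incoming items fill a fixed-size buffer; a full buffer is reduced
    to a level-0 coreset. Two coresets at the same level are merged
    and reduced into one coreset at the next level, so at any moment
    at most one pending coreset exists per level (logarithmic memory).
    """
    def __init__(self, k, buffer_size, dist):
        self.k = k
        self.buffer_size = buffer_size
        self.dist = dist
        self.buffer = []
        self.levels = {}  # level -> coreset awaiting a sibling

    def add(self, point):
        self.buffer.append(point)
        if len(self.buffer) == self.buffer_size:
            self._push(coreset(self.buffer, self.k, self.dist), 0)
            self.buffer = []

    def _push(self, cs, level):
        if level in self.levels:
            sibling = self.levels.pop(level)
            merged = coreset(sibling + cs, self.k, self.dist)
            self._push(merged, level + 1)  # cascade upward
        else:
            self.levels[level] = cs

    def summary(self):
        """Collapse all pending coresets (plus the partial buffer)
        into a single summary of at most k representatives."""
        pieces = [p for cs in self.levels.values() for p in cs]
        pieces += self.buffer
        return coreset(pieces, self.k, self.dist)
```

For a video stream, `add` would be called once per frame with its topic-vector descriptor, and `summary` at any time yields a fixed-size visual summary of everything seen so far.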
  • [PDF] R. Paul, D. Feldman, D. Rus, and P. Newman, “Visual Precis Generation using Coresets,” in Proceedings of the IEEE International Conference on Robotics and Automation (ICRA), Hong Kong, China, 2014.
    [Bibtex]

    @inproceedings{PaulICRA2014,
    Address = {Hong Kong, China},
    Author = {Rohan Paul and Dan Feldman and Daniela Rus and Paul Newman},
    Booktitle = {Proceedings of the IEEE International Conference on Robotics and Automation (ICRA)},
    Month = {May},
    Pdf = {http://www.robots.ox.ac.uk/~mobile/Papers/2014ICRA_paul.pdf},
    Title = {Visual Precis Generation using Coresets},
    Year = {2014}}