RAP: Resource-Aware Perception


Machine learning and data mining algorithms, including Support Vector Machines (SVMs), spectral clustering, and graphical models, are useful for a wide range of applications in areas such as information retrieval, computer vision, network anomaly detection, and social network analysis. However, most existing learning algorithms have high computational complexity and require all data to reside at a central site for analysis. As a result, the computation and communication resources these methods need to process large-scale data in distributed systems are often prohibitively high, and practitioners often have to approximate the original data in various ways (quantization, filtering, downsampling, etc.) before invoking the data mining algorithms.

In this project, we aim to develop a general framework for efficient learning and inference in distributed (mobile) systems. This framework involves in-network processing at distributed sites (devices) and approximate mining at the Reasoning Operation Center (ROC). The combination of distributed local processing strategies, sophisticated learning algorithms, and theoretical analysis tools enables our approach to perform in-network inference and mining that achieves high accuracy with low communication overhead.
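The split between local processing at each site and mining at the ROC can be illustrated with a toy sketch. It assumes grid quantization as the local data-reduction step, which is only one of many possible choices; the function names `summarize_site` and `merge_at_center` and the `cell` parameter are hypothetical and not part of the project.

```python
from collections import Counter

def summarize_site(points, cell=1.0):
    """Local step (assumed): snap 2-D points to grid-cell centers and count them."""
    counts = Counter()
    for x, y in points:
        # Center of the grid cell that contains (x, y).
        cx = (x // cell) * cell + cell / 2
        cy = (y // cell) * cell + cell / 2
        counts[(cx, cy)] += 1
    return counts

def merge_at_center(summaries):
    """Central step (assumed): add up the per-site cell counts."""
    total = Counter()
    for summary in summaries:
        total.update(summary)
    return total

# Two sites hold raw points; each sends only (cell center, count) pairs.
site_a = [(0.1, 0.2), (0.3, 0.4), (5.2, 5.1)]
site_b = [(0.6, 0.9), (5.7, 5.5), (5.9, 5.3)]
merged = merge_at_center([summarize_site(site_a), summarize_site(site_b)])
```

In this sketch each site transmits one entry per occupied cell rather than one entry per raw point, so a coarser `cell` lowers communication at the cost of a rougher approximation of the data.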


Within this framework, we currently:

1) Study the tradeoffs between data reduction and the loss in the SVM's classification performance. We derive approximate upper bounds on the perturbation of SVM classification and show that these bounds are empirically tight, which makes it practical for practitioners to determine the amount of data reduction allowed for a permissible loss in classification performance.

2) Study the effects of data approximation on the performance of spectral clustering. We show that the error of spectral clustering under approximation is closely related to the perturbation of the eigenvectors of the Laplacian matrix. From this result we derive approximate upper bounds on the clustering error. These bounds can be used in practical settings to determine the amount of data reduction allowed while meeting a specification of permitted loss in clustering performance. The result can also serve as a guideline for developing a class of fast approximate spectral clustering algorithms: those based on pre-grouping neighboring points and approximating a (large) data set by a reduced set of "representative" points.
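The idea of pre-grouping neighboring points and clustering only the representatives can be sketched as follows. This is a minimal illustration under stated assumptions, not the project's algorithm: it uses plain k-means for the pre-grouping, a Gaussian affinity with a hand-picked `sigma`, and a two-way split by the sign of the second eigenvector of the normalized Laplacian. The names `kmeans` and `two_way_spectral` and all parameter values are hypothetical.

```python
import numpy as np

def kmeans(X, k, iters=20, seed=0):
    """Plain Lloyd's algorithm; returns (centers, assignment)."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(iters):
        d = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        assign = d.argmin(axis=1)
        for j in range(k):
            if np.any(assign == j):
                centers[j] = X[assign == j].mean(axis=0)
    # Recompute the assignment against the final centers.
    d = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return centers, d.argmin(axis=1)

def two_way_spectral(R, sigma=2.0):
    """Split the rows of R into two groups using the normalized Laplacian."""
    d2 = ((R[:, None, :] - R[None, :, :]) ** 2).sum(axis=2)
    W = np.exp(-d2 / (2 * sigma ** 2))      # Gaussian affinity (assumed)
    np.fill_diagonal(W, 0.0)
    d_inv_sqrt = 1.0 / np.sqrt(W.sum(axis=1))
    L = np.eye(len(R)) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]
    _, vecs = np.linalg.eigh(L)             # eigenvalues in ascending order
    second = vecs[:, 1] * d_inv_sqrt        # rescaled second eigenvector
    return (second > 0).astype(int)

# Synthetic data (assumed): two well-separated blobs of 200 points each.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 0.5, (200, 2)),
               rng.normal(8.0, 0.5, (200, 2))])

reps, assign = kmeans(X, k=20)              # 400 points -> 20 representatives
rep_labels = two_way_spectral(reps)         # eigenproblem is 20 x 20
labels = rep_labels[assign]                 # each point inherits its representative's label
```

The eigendecomposition here runs on a 20 x 20 matrix instead of a 400 x 400 one; the number of representatives `k` is the knob that trades computation against the perturbation of the Laplacian's eigenvectors discussed above.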

RAP is part of Everyday Sensing and Perception (ESP), the megabet project in Intel Research to drive research breakthroughs in sensing and inference that enable a new class of context inference systems that are "90% accurate for over 90% of your day". To support context-aware inference and reasoning on mobile devices, we propose a research agenda, Resource-Aware Perception, which focuses on developing efficient algorithms for signal processing, feature extraction, object recognition, and other machine-learning-based perception on mobile devices. Because only limited computation power and communication bandwidth are available on mobile devices, the algorithms have to learn and infer useful, high-level ideas with minimal computation and communication cost.

Publications


Researchers


Collaborators