The development of Data Mining Tools for Virtual Observatory

 

Iakov Pustilnik

 

PhD Student at Faculty of Cybernetics, Moscow Engineering and Physics Institute, Moscow, Kashirskoe av. 31, Russia,
e-mail: xyapus@gmail.com

 

 

We present the development of a VO-ready data mining application originally aimed at automatic supervised classification of objective prism spectral surveys. The core methods available to the end user include spectral-oriented Principal Component Analysis, Support Vector Machines, and several neural network methods (Learning Vector Quantization and Multi-Layer Perceptron). The application is designed to support deep VO integration: PLASTIC user-interface integration for building learning datasets and configuring the classification rules, VOSpace integration for storing these data server-side, and a web-service-oriented architecture that frees the end user from transferring large datasets and allows composite calculations to be performed.
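As an illustration of the supervised classification step, the following is a minimal sketch of one of the methods named above, Learning Vector Quantization, in its basic LVQ1 form. It is not the application's implementation: the toy "spectra" are synthetic, and the function names (train_lvq1, predict, make_toy_spectra), the class labels, and all parameter values are assumptions made for this example only.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(prototypes, x):
    """Index of the prototype closest to sample x."""
    return min(range(len(prototypes)),
               key=lambda i: sq_dist(prototypes[i][0], x))


def train_lvq1(samples, labels, epochs=20, lr0=0.3, seed=0):
    """Basic LVQ1 with one prototype per class (an assumption of this sketch)."""
    rng = random.Random(seed)
    prototypes = []
    for cls in sorted(set(labels)):
        # Initialise each prototype at the first training sample of its class.
        first = next(s for s, l in zip(samples, labels) if l == cls)
        prototypes.append((list(first), cls))
    order = list(range(len(samples)))
    for epoch in range(epochs):
        lr = lr0 * (1.0 - epoch / epochs)  # linearly decaying learning rate
        rng.shuffle(order)
        for i in order:
            x, y = samples[i], labels[i]
            w, cls = prototypes[nearest(prototypes, x)]
            # Attract the winning prototype if its class matches, else repel it.
            sign = 1.0 if cls == y else -1.0
            for k in range(len(w)):
                w[k] += sign * lr * (x[k] - w[k])
    return prototypes


def predict(prototypes, x):
    """Class label of the prototype nearest to x."""
    return prototypes[nearest(prototypes, x)][1]


def make_toy_spectra(n_per_class=50, n_bins=8, seed=1):
    """Synthetic stand-in for low-resolution spectra (hypothetical data):
    a flat noisy continuum, with an extra peak in one bin for 'emission'."""
    rng = random.Random(seed)
    samples, labels = [], []
    for cls, line_bin in (("emission", 2), ("continuum", None)):
        for _ in range(n_per_class):
            s = [1.0 + rng.gauss(0.0, 0.1) for _ in range(n_bins)]
            if line_bin is not None:
                s[line_bin] += 2.0
            samples.append(s)
            labels.append(cls)
    return samples, labels


samples, labels = make_toy_spectra()
prototypes = train_lvq1(samples, labels)
accuracy = sum(predict(prototypes, s) == l
               for s, l in zip(samples, labels)) / len(samples)
print("training accuracy:", accuracy)
```

In a realistic setting the input vectors would more plausibly be a few leading principal components of each spectrum rather than raw bins, and several prototypes per class would be used; both choices are left out here to keep the sketch short.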

A brief comparative overview of the data mining techniques commonly used in astronomy was carried out to select the optimal methods to implement. Preliminary results of applying this approach to the extraction of actively star-forming galaxies from the Hamburg Quasar Survey objective prism spectra are presented and discussed. These results show good prospects for the efficient use of huge spectral databases and underline the importance of creating such a full-featured data mining system for the Virtual Observatory.