|
1
|
- David De Young
and the USNVO Collaboration
|
|
2
|
- Astrophysical data is growing exponentially
- Doubling every year (Moore’s Law): both data sizes and number of data sets
- Computational resources scale the same way
- Constant funding levels will keep up with the data
- Main problem is the software component
- Currently components are not reused
- Software costs are an increasingly large fraction of project budgets
- Aggregate costs are growing exponentially
|
|
3
|
- When and where are discoveries made?
- Always at the edges and boundaries
- Going deeper, using more wavelength bands
- Physicists make many measurements and discard most; astronomers make
many measurements and make discoveries from the full set and its combinations
- Metcalfe’s law
- Utility of computer networks grows as the
number of possible connections: O(N²)
- VO: Federation of N archives
- Possibilities for new discoveries grow as O(N²)
- Current sky surveys have proven this
- Very early discoveries from SDSS, 2MASS, DPOSS
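
The O(N²) scaling above can be made concrete by counting the unordered pairs of archives available for cross-comparison, N(N-1)/2. The following is a minimal sketch; the function name is illustrative, and the three survey names are simply the ones listed on this slide.

```python
from itertools import combinations

def pairwise_federations(archives):
    """Return every unordered pair of archives that could be cross-matched."""
    return list(combinations(archives, 2))

surveys = ["SDSS", "2MASS", "DPOSS"]
for a, b in pairwise_federations(surveys):
    print(f"{a} x {b}")

# The pair count follows N(N-1)/2, which grows as O(N^2).
for n in (3, 10, 100):
    names = [f"archive{i}" for i in range(n)]
    print(n, len(pairwise_federations(names)))
```

Counting only pairs is a conservative reading; combinations of three or more archives would grow faster still.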
|
|
4
|
|
|
5
|
- Exponential growth
- Data will never be centralized
- More responsibility on projects
- Becoming Publishers and Curators
- Larger fraction of budget spent on software
- Much development effort is duplicated and wasted
- More standards are needed
- Easier data interchange, fewer tools
- More templates are needed
- Individuals develop less software
|
|
6
|
- Astrophysics has a good track record
- FITS: universally used to share low level data
- Individual images, tables, files
- But: new industry standards emerging
- Requirements of modern data exchange:
- More dynamic (streams, queries)
- Merging heterogeneous sources
|
|
7
|
- Locate data from user supplied source
- Download and study documentation
- Identify necessary data components
- Copy data to local machine
- Read and filter data locally
- Perform the analysis locally
- Time Consuming and Inefficient
|
|
8
|
- Phase 1
- Auto-discovery of data, and documentation
- Study documentation
- Filter (query) data from remote source
- Analyze incoming data stream directly
- Phase 2
- Perform even the analysis remotely,
close to the data source
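
A rough sketch of the Phase 1 pattern (filter at the source, analyze the incoming stream directly) is given below. The "remote" archive is simulated by a local generator, and all names (`remote_query`, `streaming_mean`, the `mag` column) are hypothetical, not part of any NVO interface.

```python
from typing import Callable, Dict, Iterable, Iterator

Row = Dict[str, float]

def remote_query(rows: Iterable[Row], predicate: Callable[[Row], bool]) -> Iterator[Row]:
    """Stand-in for a remote archive: filters at the source and streams the matches."""
    for row in rows:
        if predicate(row):
            yield row

def streaming_mean(stream: Iterable[Row], column: str) -> float:
    """Consume the stream once, keeping only a running sum and a count."""
    total = 0.0
    count = 0
    for row in stream:
        total += row[column]
        count += 1
    if count == 0:
        raise ValueError("no rows matched the query")
    return total / count

# Toy catalog standing in for data held by the archive.
catalog = [
    {"ra": 10.0, "dec": -5.0, "mag": 18.0},
    {"ra": 10.1, "dec": -5.1, "mag": 22.0},
    {"ra": 10.2, "dec": -5.2, "mag": 16.0},
]
bright = remote_query(catalog, lambda r: r["mag"] < 20.0)
print(streaming_mean(bright, "mag"))
```

Only the rows that pass the filter cross the (simulated) network, and the client never holds the full result in memory, in contrast to the copy-then-filter workflow of the previous slide.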
|
|
9
|
- Today
- Accessing remote data:
- WWW, FTP
- Data formatted in certain ways
- Accessing remote computing:
- Hard-configured local-area clusters
- Remote supercomputers
- Need to move data to the computing
- Available resources do not always match problem
|
|
10
|
- Standardizing distributed data
- Web Services, supported on all platforms
- Custom configure remote data dynamically
- XML: Extensible Markup Language
- SOAP: Simple Object Access Protocol
- WSDL: Web Services Description Language
- Standardizing distributed computing
- Grid Services
- Custom configure remote computing dynamically
- Build your own remote computer, and discard
- Virtual Data: new data sets on demand
|
|
11
|
- THE VIRTUAL OBSERVATORY CONCEPT
- Characteristics
- Distributed
- Science Driven
- Integrated With Information Technology
- Broad Based Community Support
- Builds on Existing Infrastructure
|
|
12
|
- National Academy of Sciences “Decadal Survey” recommended NVO as highest
priority small (<$100M) project
- “Several small initiatives recommended by the committee span both
ground and space. The first
among them—the National Virtual Observatory (NVO)—is the committee’s
top priority among the small initiatives. The NVO will provide a “virtual sky”
based on the enormous data sets being created …”
- —Astronomy and Astrophysics in the New Millennium, p. 14
|
|
13
|
- ORIGINS
- VO White Paper (Alcock, Prince, Szalay): Jun 1999
- First NVO Workshop (JHU): Nov 1999
- Formation of Initial Working Groups (Science, Management, Technical):
Nov 1999
- Formation of Interim Steering Committee: Feb 2000
- Second NVO Workshop (NOAO): Feb 2000
- Presentations to NASA and NSF: May 2000
- First Major NVO Meeting (CIT): Jun 2000
- Submission of Proposal to NSF: May 2001
|
|
14
|
- NSF ITR project, “Building the Framework for the National Virtual
Observatory” is a collaboration of 17 funded and 3 unfunded
organizations
- Astronomy data centers
- National observatories
- Supercomputer centers
- University departments
- Computer science/information technology specialists
- PI and project director: Alex Szalay (JHU)
- CoPI: Roy Williams (Caltech/CACR)
- $10M award for five-year period, beginning 1 Nov 2001
|
|
15
|
|
|
16
|
|
|
17
|
- Executive Committee
- A. Szalay, R. Williams, R. Hanisch (PM), D. De Young (PS), R. Moore
(SA), G. Helou, E. Schreier
- Education & Outreach
- First Working Groups established
- Metadata (R. Plante/NCSA)
- Systems (R. Moore/UCSD)
- Science (D. De Young/NOAO)
- Project teams established for initial science demonstrations
- GRB follow-up (T. McGlynn/HEASARC)
- Brown dwarf search (B. Berriman/IPAC)
- Cluster galaxy morphologies (R. Plante/NCSA)
|
|
18
|
- Integral part of project
- Emphasis is on development of partnerships
- Initiated with a workshop this summer at STScI (July 11-12)
- Understand requirements on NVO services from perspective of formal
education, informal education, commercial/corporate, and public
outreach content developers
|
|
19
|
|
|
20
|
- Formal management plan delivered to NSF in January 2002
- 11 major work breakdown categories, with sub-elements to three levels
- All level-two technical WBS areas have designated lead who is
responsible for tasks and schedule within that area
|
|
21
|
|
|
22
|
- Nov 2001 – Jan 2002: Established project structure
- May 2002: Defined initial science demos
- June 13, 2002: Formed International VO Alliance
- Nov 15, 2002: Internal testing of science demos
- January 2003: Initial science demonstrations (AAS)
- August 2003: Intermediate NVO science demos (IAU)
|
|
23
|
- Formal Quarterly and Annual Reports to NSF; copied to NASA
- Informal monthly reports to project manager
- Biweekly project status telecons with level-two WBS leaders
- Weekly Executive Committee telecons
- Weekly or biweekly working group telecons (Metadata, Systems, Science)
- Archived e-mail exploders for all working groups and management
discussions
|
|
24
|
- Define commonly used small services
- Build higher level toolboxes/portals on top
- Do not build “everything for everybody”
- Use the “90-10” rule:
- Define the standards and interfaces
- Build the framework
- Build the 10% of services that are used by 90% of the community
- Let the users build the rest from the components
|
|
25
|
- First year: emphasize prototyping
and experimentation, leading to real demos but not necessarily
production-level software or system
- Many IT tools now available; extensive evaluation through prototypes
necessary to refine choices
- Set up framework for more formal
software management (baseline, test, revision control) for a
distributed development effort in year 2
- NSF ITR project is not expected to define and “deliver” the entire NVO
|
|
26
|
- Science demonstrations
- Identified, scoped and scheduled
- Service registry issues
- Needs international coordination (Garching)
- User interface issues
- Need to retrofit existing portals
- EPO requirements
- Impact on metadata standards
|
|
27
|
- Keep focus on user and science needs
- Identify most common services
- Verify standardization efforts
- Encourage data providers to participate
- Demonstrate to the community that NVO tools
- will arrive soon
- will be useful for everybody
- can evolve incrementally
- First science demos planned for January 2003
|
|
28
|
- Brown-Dwarf search
- Distributed query across several archives
- Correlations with non-detections
- Example of typical NVO search
- Gamma-ray burst
- Event follow-up service
- Exercise in standards compliance/interoperability
- Galaxy evolution in clusters
- On-the-fly image analysis and pattern recognition
- Exercise in grid computing
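
The "correlations with non-detections" idea in the brown-dwarf search can be sketched as a positional cross-match that keeps the sources in one catalog with no counterpart in another. This is a brute-force illustration under assumed inputs: the catalog contents, the field names, and the 2-arcsecond match radius are hypothetical, and a real search would use an indexed, distributed query rather than an O(N·M) loop.

```python
import math

def angular_separation_deg(ra1, dec1, ra2, dec2):
    """Great-circle separation in degrees (haversine formula)."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    h = (math.sin((dec2 - dec1) / 2) ** 2
         + math.cos(dec1) * math.cos(dec2) * math.sin((ra2 - ra1) / 2) ** 2)
    return math.degrees(2 * math.asin(min(1.0, math.sqrt(h))))

def unmatched(infrared, optical, radius_deg):
    """Infrared sources with no optical source within radius_deg."""
    return [
        src for src in infrared
        if not any(
            angular_separation_deg(src["ra"], src["dec"], o["ra"], o["dec"]) <= radius_deg
            for o in optical
        )
    ]

# Hypothetical catalogs; coordinates in degrees.
infrared = [
    {"id": "ir1", "ra": 150.0, "dec": 2.0},
    {"id": "ir2", "ra": 150.5, "dec": 2.5},
]
optical = [{"id": "opt1", "ra": 150.0001, "dec": 2.0001}]

candidates = unmatched(infrared, optical, radius_deg=2.0 / 3600.0)
print([src["id"] for src in candidates])
```

Sources that are detected in the infrared but missing from the optical catalog are the kind of candidate list such a search would pass on for follow-up.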
|
|
29
|
- European initiatives underway
- Astrophysical Virtual Observatory, funded by European Commission
(€3.3 million, three years)
- AstroGrid, funded by UK e-science program
(£5 million, three years)
- Other international efforts starting:
- Canada (C$4M recently approved), India, Japan, Chile, Germany, Russia,
Australia
- International VO roadmap announced at Garching VO conference, 10 June
2002
- International VO Alliance formed, 13 June 2002
- Regular telecons among NVO, AVO, and AstroGrid leadership
- Frequent technical contacts among partners
|
|
30
|
- AVO
- P. Quinn (co-chair)
- B. Pirenne
- K. Gorski
- F. Genova
- P. Benvenuti
- AstroGrid
- A. Lawrence
- N. Walton (sec’y)
- T. Linde
- Russian VO
- Canadian VO
- NVO
- A. Szalay
- R. Williams (tech coord.)
- R. Hanisch (chair)
- R. Moore
- D. De Young
- G. Helou
- E. Schreier
- G. Djorgovski
- Australia
- German AVO
- India VO
|
|
31
|
- Active collaboration among NVO, AVO, and AstroGrid on VOTable
- V1.0 released on April 15
- Basis for testing metadata models, exchange protocols, encoding
mechanisms
- Continued development of FITS standard
- World Coordinate System definitions
- Framework definition
- Celestial coordinates
- Spectral dispersion relations
- Distortion functions
- Time
|
|
32
|
- NSF ITR NVO project is one of four major and numerous other small
VO-related initiatives now underway world-wide
- NVO is adopting, adapting, or developing necessary technology as derived
from science requirements
- Project management approach seems
to be working, based on the first six months of experience
- NVO project is dealing with many of the management challenges that will
face the ultimate VO organization
|