HOW TO MAKE THE DREAM COME TRUE: THE ASTRONOMERS’ DATA MANIFESTO

Ray P. Norris

CSIRO Australia Telescope, PO Box 76, Epping, NSW 1710, Australia

Email: Ray.Norris@csiro.au


Astronomy is one of the most data-intensive of the sciences. Data technology is accelerating the quality and effectiveness of its research, and the rate of astronomical discovery is higher than ever. As a result, many view astronomy as being in a “Golden Age”, and projects such as the Virtual Observatory are amongst the most ambitious data projects in any field of science. But these powerful tools will be impotent unless the data on which they operate are of matching quality. Astronomy, like other fields of science, therefore needs to establish and agree on a set of guiding principles for the management of astronomical data. To focus this process, we are constructing a “data manifesto”, which proposes guidelines to maximise the rate and cost-effectiveness of scientific discovery.

Keywords: astronomy, data management, virtual observatory

1. INTRODUCTION

The last few years have seen a revolution in the way astronomers use data. An astronomer can type the name of an object into a web page, and instantly view a wide range of observed data on that object, obtain references to all publications that mention it, and even produce plots of the spectral energy distribution (SED: Fig 1).

Figure 1: A typical Spectral Energy Distribution (SED) generated automatically by the NASA/IPAC Extragalactic Database (NED) using data collected by many different authors and instruments.

This SED includes data obtained from many different instruments using different technologies, calibration processes, and data formats, brought together in data centres that understand the instrument-specific metadata. All papers written in the astronomical journals are available on-line through a powerful engine that searches the entire body of astronomical literature. Many such papers will contain links to other publications and data.
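As a rough illustration of what bringing heterogeneous measurements together involves, the following minimal sketch converts measurements reported at different wavelengths and in different flux units onto a common frequency and jansky scale, then orders them to form an SED. The `Measurement` record, its field names, and the sample labels are hypothetical and are not taken from NED; a real data centre must also handle calibration, apertures, and uncertainties.

```python
from dataclasses import dataclass

C_M_PER_S = 299_792_458.0  # speed of light in m/s

# Conversion factors to janskys (1 Jy = 1e-26 W m^-2 Hz^-1).
TO_JY = {"Jy": 1.0, "mJy": 1e-3, "uJy": 1e-6}


@dataclass
class Measurement:
    """One published flux-density measurement (hypothetical record layout)."""
    source: str          # label for the instrument or paper (assumed field)
    wavelength_m: float  # observing wavelength in metres
    flux: float          # flux density in the unit given below
    unit: str            # one of the keys of TO_JY


def to_sed_point(m):
    """Return (frequency in Hz, flux density in Jy, source label)."""
    freq_hz = C_M_PER_S / m.wavelength_m
    flux_jy = m.flux * TO_JY[m.unit]
    return freq_hz, flux_jy, m.source


def build_sed(measurements):
    """Convert all measurements to common units and sort by frequency."""
    return sorted(to_sed_point(m) for m in measurements)


if __name__ == "__main__":
    sample = [
        Measurement("infrared survey", 1e-4, 2.0, "Jy"),
        Measurement("radio survey", 0.21, 150.0, "mJy"),
    ]
    for freq_hz, flux_jy, source in build_sed(sample):
        print(f"{freq_hz:.3e} Hz  {flux_jy:.3f} Jy  {source}")
```

Even this toy version only works because each record states its unit explicitly, which is the kind of metadata the data centres currently have to recover by hand.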

The Virtual Observatory (VO) promises to place even more power at the hands of the astronomer, and with it the capability of accelerating the rate of scientific discovery. The VO will enable the astronomer to search all available databases in a region of sky, and superimpose or combine the images, or produce a plot comparing measurements on different instruments. Some of us dream even further. For example, I look forward to the day when I can move my mouse over an image I have just produced, and the VO dynamically gives me all available information, in the form of graphs, images, and literature, about the position underneath my cursor.
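The idea of searching every available database around one sky position can be sketched as a simple cone search. The example below is an illustrative sketch only: the catalogue layout (a dictionary of lists of rows with `name`, `ra`, and `dec` in degrees) is an assumption made for this example and is not a VO interface. It uses the standard haversine formula for the angular separation between two positions.

```python
import math


def angular_separation_deg(ra1, dec1, ra2, dec2):
    """Great-circle separation in degrees between two sky positions (degrees)."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    sin_ddec = math.sin((dec2 - dec1) / 2)
    sin_dra = math.sin((ra2 - ra1) / 2)
    a = sin_ddec ** 2 + math.cos(dec1) * math.cos(dec2) * sin_dra ** 2
    # Clamp to 1.0 to guard against rounding slightly above the asin domain.
    return math.degrees(2 * math.asin(min(1.0, math.sqrt(a))))


def cone_search(catalogues, ra, dec, radius_deg):
    """Return (separation, catalogue name, object name) for every hit.

    `catalogues` maps a catalogue name to a list of rows, each a dict with
    'name', 'ra' and 'dec' keys (a hypothetical layout for this sketch).
    """
    hits = []
    for cat_name, rows in catalogues.items():
        for row in rows:
            sep = angular_separation_deg(ra, dec, row["ra"], row["dec"])
            if sep <= radius_deg:
                hits.append((sep, cat_name, row["name"]))
    return sorted(hits)  # nearest objects first


if __name__ == "__main__":
    catalogues = {
        "radio": [{"name": "A", "ra": 10.0, "dec": -30.0},
                  {"name": "B", "ra": 50.0, "dec": 20.0}],
        "optical": [{"name": "C", "ra": 10.01, "dec": -30.0}],
    }
    for sep, cat, name in cone_search(catalogues, 10.0, -30.0, 0.1):
        print(f"{name} ({cat}): {sep:.4f} deg")
```

The search itself is straightforward; it depends on every catalogue exposing positions in a shared, documented coordinate convention, which is a data management problem rather than a software one.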

How do we turn these dreams and promises into reality? One requisite is obviously to build the necessary tools, services, and data structures, and the VO is doing just that. However, these tools will be ineffective without high-quality data on which to operate. While data from major international observatories, such as the European Southern Observatory and NASA’s Great Observatories, are now freely available and managed in a way that is difficult to fault, much, perhaps most, of the remaining astronomical data and information are still relatively inaccessible. Even worse, some of these data are so poorly managed that they will be lost.

One of the reasons for poor data management is that many astronomers and observatory directors are unaware that good data management can generate good science, and that bad data management can inhibit the process of scientific discovery. Furthermore, in many areas there is not even a consensus on what constitutes good data management (Norris, 2005; Norris et al. 2006).

In an attempt to stimulate a discussion that might lead to such a consensus, and to promote awareness of these issues, a group of astronomers recently established “An Astronomers’ Data Manifesto”.

In this paper, I discuss the successes and challenges of astronomical data management, and describe the manifesto and its purpose.


2.1 The Astronomical Literature

Virtually all papers in the fields of astronomy, astrophysics, and related areas are referenced by the Smithsonian/NASA Astrophysics Data System (http://www.adsabs.harvard.edu/), known colloquially as the ADS. It references not only the mainstream journals, but conference reports, theses, preprint servers, and even institutional technical reports, where they are made public. It includes links to the papers in all their published forms, so that, for astronomers with institutional access to the journals, this provides transparent access to the entire astronomical literature. Even authors who publish papers (such as this) in non-astronomical journals can request to have their paper listed by the ADS. Powerful facilities enable a search to be made by author, title, keyword, or even text contained within the paper.

As a result, the ADS has probably become the primary entry point to the published literature for most astronomers.

To disseminate their new research results, most authors now submit preprints (usually after acceptance by a journal) to arXiv.org (http://www.arxiv.org/list/astro-ph/new). This has become the primary means of accessing new research results, and many astronomers check it daily. Basic search facilities are also available, although ADS probably remains the most flexible way of accessing the arXiv.org contents.

The principal commercial astronomical journals have responded positively to these changes, and their electronic editions have become the main journals of record. It is likely that paper editions will be phased out within a few years. However, it is unclear how the astronomical publishing paradigm will change, given a number of conflicting forces:

1. There is a growing demand for open access, or free, journals, particularly as a solution to the “Digital Divide” discussed below. However, it is not yet clear how an open-access journal can afford to maintain the editorial quality and peer-review processes currently offered by mainstream journals.

2. Since most astronomers now access the literature via ADS or arXiv, the “title” of a journal has become less important. While there is still prestige associated with publishing in a high-impact journal, the actual visibility is similar regardless of where the paper is published, and so in time the impact factor may cease to differentiate journals. The commercial journals will therefore need to offer additional value, compared to open access journals, if they are to retain their authors and readers.

3. Some of the main journals have an excellent track record of responding to the changing demands of the astronomical community, and of promoting initiatives such as electronic access to associated data and tables, and linkages to other data centres. As a result, there is a significant groundswell of support for such journals from within the astronomical community.

2.2 Astronomical Data Centres

Astronomy enjoys a number of first-class data centres, the best known of which are CDS and NED.

NED is the NASA/IPAC Extragalactic Database (http://nedwww.ipac.caltech.edu/), which offers access to data taken from the literature and from major astronomical surveys. Its search engine provides all available data on an object or position in the sky, for which it will list measured data, images, references to the literature, and even some interpretation by comparing measurements made at different wavelengths by different authors and instruments (Fig 1). The difficulty of accomplishing this latter feat should not be underestimated, as authors use different metadata, jargon, and (even if they don’t know the word) ontologies. As well as providing access to data, NED also provides a number of tools and innovative facilities such as its knowledgebase. Its key constraint is that it is designed to include only extragalactic objects (i.e. objects lying outside the Milky Way) and so does not include, for example, stars within the Milky Way, or solar-system objects. Nevertheless, for those who focus on extragalactic astronomy, NED has become the primary tool for accessing data.

CDS is the Centre de Données Astronomiques de Strasbourg (http://cdsweb.u-strasbg.fr/). Like NED, it offers access to data from the literature and from major surveys, and provides tools and search engines to access and interpret those data. It differs from NED in that it aims to include data on all astronomical objects outside the solar system, whether extragalactic or galactic. The main databases at CDS are VizieR, which includes nearly all major published surveys and tables, and Simbad, which provides search tools to access data taken both from the literature and from surveys. A number of other powerful tools are also provided, such as Aladin, which enables a user to superimpose images from several data sources, including personal files. Just as important are the CDS research and development programs, which have been influential in shaping the way in which astronomers use data, and continue to be important drivers in the development of the VO.

A number of other major data centres around the world, such as those in Canada, China, Japan, and Russia, offer significant features for particular purposes. In addition, a number of specialised data centres exist to serve data for particular instruments or classes of instrument, such as NASA’s High Energy Astrophysics Science Archive Research Center (http://heasarc.gsfc.nasa.gov/). Furthermore, the electronic data provided by the journals themselves effectively constitute a data centre, a blurring which is increasing as journals explore innovative projects such as those which offer to store authors’ source data. Finally, it is important to acknowledge the regrettable closure of a major data centre (NASA’s Astrophysical Data Center) in 2002, which serves as a warning against any complacency that high-performing data centres are immune to threats of closure.

2.3 Linkages between Literature and Data Centres

Many astronomers assume that data centres such as NED and CDS give them access to essentially all the published data. However, Andernach (2006), who has conducted a case study of over 2000 published articles, finds that typically only about 50% of results published in journals ever appear in the data centres, and lists some surprising and significant omissions.

It is not hard to understand why. At present, when authors submit data such as tables, spectra, or images, to journals, they do so in a variety of formats. The meaning of the axes or columns is often only apparent after reading the captions or the body of the paper, and authors continue to use jargon which is opaque to anyone outside the immediate field. Even worse, formatting errors still occur in published data tables, further impairing attempts at machine-readability.

To incorporate these data into a data centre requires a knowledgeable expert to interpret the words of the author, so that the results can be translated into a standard form. Given the finite resources of the data centres, and the expanding volume of astronomical literature, the availability of such knowledgeable experts then becomes a bandwidth bottleneck between the literature and the data centres.

Naturally, users would like to see all peer-reviewed results appear in the data centres. This will become increasingly important as VO tools become more widely used. In the same way that, at the moment, an astronomical publication that does not appear in ADS is effectively invisible to the astronomical community and will probably never be cited, in a few years’ time a result or image that is not accessible by the VO may never be used, generating wasteful repeated observations and slowing down the rate of scientific discovery.

How can we ensure that all validated astronomical data appear in the data centres? One solution would be to increase funding to the data centres so that they can employ enough knowledgeable experts to interpret all published data, but the finite available resources make this option unlikely.

An alternative is to find ways of automatically transferring published data from journals to data centres. This would probably require that the authors provide data in standard formats and that they provide the necessary metadata to interpret them. Then, if an author chooses to supply these metadata, and certifies that the data have been checked using appropriate tools, they could be imported automatically into the data centres.
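To make the idea of checking data with appropriate tools more concrete, here is a minimal sketch of an author-side validator that checks a data table against declared column metadata before submission. The metadata fields (`name`, `unit`, `type`) and the function name are assumptions chosen for this illustration; they do not correspond to any particular journal or data centre standard.

```python
# Supported column types and the Python callables used to parse them.
CASTERS = {"float": float, "int": int, "str": str}
REQUIRED_FIELDS = ("name", "unit", "type")


def validate_table(columns, rows):
    """Return a list of human-readable problems; an empty list means OK.

    `columns` is a list of dicts describing each column (hypothetical
    schema: 'name', 'unit', 'type'); `rows` is a list of dicts mapping
    column names to raw values as they appear in the submitted table.
    """
    errors = []

    # 1. Every column must carry the metadata needed to interpret it.
    for i, col in enumerate(columns):
        for field in REQUIRED_FIELDS:
            if field not in col:
                errors.append(f"column {i}: missing metadata field '{field}'")
        if "type" in col and col["type"] not in CASTERS:
            errors.append(f"column {i}: unknown type '{col['type']}'")

    # 2. Every value must be present and parse as the declared type.
    for r, row in enumerate(rows):
        for col in columns:
            name = col.get("name")
            caster = CASTERS.get(col.get("type"))
            if name is None or caster is None:
                continue  # already reported in the metadata check
            if name not in row:
                errors.append(f"row {r}: missing value for '{name}'")
                continue
            try:
                caster(row[name])
            except (TypeError, ValueError):
                errors.append(
                    f"row {r}: '{name}' value {row[name]!r} is not a valid {col['type']}"
                )
    return errors


if __name__ == "__main__":
    columns = [
        {"name": "ra_deg", "unit": "deg", "type": "float"},
        {"name": "flux", "unit": "mJy", "type": "float"},
    ]
    rows = [
        {"ra_deg": "10.5", "flux": "3.2"},
        {"ra_deg": "10.6", "flux": "3,4"},  # formatting error: comma decimal
    ]
    for problem in validate_table(columns, rows):
        print(problem)
```

A check of this kind catches only mechanical problems such as missing units or malformed numbers; it cannot tell whether the values themselves are scientifically correct.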

This effectively redistributes the transcription workload from the data centres to the authors, and necessarily entails more work for authors. However, they benefit from the greater scientific impact and the higher citation rate that will result from their data being in the data centres. In many cases the paper itself will benefit from this further level of checking.

There is a potential disadvantage to such a system, in that it increases the likelihood that simple formatting errors in published papers might ultimately reduce the quality of the data in the centres. It remains to be seen whether automated checking procedures can reduce this possibility to the level where the disadvantage is outweighed by the advantage of doubling the quantity of high-quality information offered by the data centres.

3. OPEN ACCESS

Most astronomical data are unfettered by intellectual property or confidentiality issues, other than widely supported exceptions such as initial protection of observers’ data by major facilities. As a result, astronomical archive data are generally available to all astronomers at no charge. It is this tradition which has enabled the success of astronomical data centres, and which will be vital for the success of the VO. The adoption of an open-access policy is not just for the public good. For example, the Hubble archive results in roughly three times as many papers as those based on the original data (Beckwith, 2004). Similarly, the International Ultraviolet Explorer (IUE) archive increased the usage of IUE data by a factor of 5 (Wamsteker & Griffin, 1995). So, in principle, observatories might multiply their scientific output by making their archive data public. Since the funding for most major observatories depends on performance indicators such as publications and citations, it may be an expensive decision for an observatory not to adopt an open-access policy.
