:tocdepth: 2 Binary formats useful for astronomers ************************************* Pyfits - Reading and writing fits files ======================================= `PyFITS `_ is a Python module developed at STScI to read and write all types of fits files. The full html manual is available `here `_. The pdf version of the `manual `_ is more than hundred pages long. .. admonition:: External resource! The PyFITS tutorial itself is good for our purpose and since this is the internet I will not reinvent the wheel. Read the `Pyfits tutorial `_ then come back to this page for an exercise. .. admonition:: Exercise Use the following code to download a fits file for this exercise:: import urllib2, tarfile url = 'http://python4astronomers.github.com/core/core_examples.tar' tarfile.open(fileobj=urllib2.urlopen(url), mode='r|').extractall() cd py4ast/core ls Read in the fits file. Find the time and date of the observation, and read the intensity array. .. raw:: html

Click to Show/Hide Solution

Here is a possible solution:: import pyfits hdus = pyfits.open('3c120_stis.fits.gz') hdus.info() head = hdus[0].header head.keys() # lists all keywords in a dictionary head['TDATEOBS'] head['TTIMEOBS'] img = hdus[1].data # Intensity data We can use ``plt.imshow()`` (matplotlib again) to display the intensity array using some sensible minimum and maximum value so that the spectrum is visible:: plt.clf() plt.imshow(img, origin = 'lower', vmin = -10, vmax = 65) plt.colorbar() We will revisit this piece of code in the :doc:`../core/numpy_scipy` part of the tutorial. .. raw:: html
``pickle``: easy binary persistence =================================== The Pickle package is in the Python standard library, is useful to store arbitrary objects to a file. >>> import cPickle as pickle >>> l = [1, None, 'Stan'] >>> pickle.dump(l, file('test.pkl', 'w'), protocol=2) >>> pickle.load(file('test.pkl')) [1, None, 'Stan'] The ``protocol`` keyword specifies the algorithm used to convert the object into a binary file. ``protocol=2`` is fastest and creates the smallest files, and should be preferred. A drawback of the pickle format is that earlier versions of Python than the version used to create the pickle may not be able to read it. The higher the protocol level (level 2 is the highest), the less likely an earlier version of Python can read the file. For this reason it's best to restrict pickling to saving temporary results. For long-term storage other binary formats, such as FITS or HDF5, are better. ``json``: text file persistence =============================== The ``json`` package in the standard library is similar to the ``pickle`` package, but it allows you to store Python objects as text files in the json format, a sub-set of the XML format. In general this is inefficient, as text files are slow to create and read compared to binary files, and often result in larger file sizes. In addition, only basin Python types can be saved (not Numpy arrays, for example). However, they have the advantage of being easy to read in a text editor, and the json format can be easily parsed by many other programming languages (Java and C for example). The process for creating and loading json files is similar to pickling:: import json d = {'name':'G0001', 'RA': 2.34531, 'Dec': -40.0112} json.dump(d, file('coord.jsn', 'w')) d = json.load(d, file('coord.jsn', 'r')) >>> d {u'Dec': -40.0112, u'RA': 2.34531, u'name': u'G0001'} .. note:: **Unicode:** The ``u`` in front of the strings in the dictionary after the json file has been read stands for unicode. This is because the strings are saved as unicode. Mostly you can ignore the difference between ascii and unicode strings for scientific programming, but for more information see `here `_. Unifying the ascii and unicode types into a single string type is one of the major changes between Python 2.x and Python 3. Reading IDL .sav files ====================== IDL is still a very common tool in astronomy. While IDL packages exist to read and write data in simple (ASCII) or standardized file formats (fits), that users of all platforms can use, IDL also offers a binary file format with an undocumented, proprietary structure. However, acess to this file format (usually called ``.sav``) is very simple and convenient in IDL. Therefore, many IDL users dump their data in this way and you might be forced to read ``.sav`` files a colleague has sent you. Here is an examplary ``.sav`` :download:`file <../downloads/myidlfile.sav>`. If you have trouble downloading the file, then use IPython:: import urllib2 url = 'http://python4astronomers.github.com/_downloads/myidlfile.sav' open('myidlfile.sav', 'wb').write(urllib2.urlopen(url).read()) ls What can you do? 1. Convert your colleague to use a different file format. 2. Read that file in python. If you have a relatively recent version (at least 0.9) of ``scipy`` then this is a matter of two lines:: from scipy.io.idl import readsav data = readsav('myidlfile.sav') If your scipy is older, then you need to install the package `idlsave `_ yourself. (Go back to :doc:`../installation/packages` for details on package installation.) In a normal terminal (outside ``ipython``) do:: pip install --upgrade idlsave or, if you install packages as root user on your system:: sudo pip install --upgrade idlsave Then import the package and read the data:: import idlsave data = idlsave.read('myidlfile.sav') .. admonition:: Exercise: Where is your data? ``idlsave`` already prints some information on the screen while reading the file. Inspect the object ``data``, find out how you use it to access the ``x`` and ``y`` data in it. Note you may get an error saying ``KeyError: 'rewrite'``. This can be ignored (it's a bug involving IPython and SciPy). You can still use tab completion to look at the attributes and methods of ``data``. .. raw:: html

Click to Show/Hide Solution

``data`` is a dictionary and all the variables in the ``.sav`` file are fields in this dictionary. You get a list with ``data.keys()``. Then, this is easy:: data['x'] data['y'] .. raw:: html
Note that ``idlsave`` cannot write files and that it will fail to read if the ``.sav`` file contains special structures like system variables or compiled IDL code. .. include:: ../references.rst