MagE Pipeline

Documentation on setting up and running the MagE pipeline

Introduction
Requirements for Running the Pipeline
Setting up the MagE Database File
Generating Pipelines
Running the Calibration Pipelines
Running the Science Pipelines
Other Optional Parameters
If All Goes Well

1. Introduction

Users of CarPy for spectroscopic data reduction should cite the following papers:

Kelson, D.D., Illingworth, G.D., van Dokkum, P.G., & Franx, M. 2000, ApJ, 531, 159
Kelson, D.D. 2003, PASP, 115, 688

These pages discuss the pipeline written by Dan Kelson (http://www.ociw.edu/~kelson). Other helpful tips can be found here (will update this soon):

The pipeline is written in Python (http://www.python.org/) and uses an extensive set of C, FORTRAN, and C++ libraries to do the heavy lifting. The pipeline requires that you have CarPy installed. SBS users should contact Dan or Edward for help on setting up and running the pipeline on their own machines. Instructions on how to download CarPy can be found here.

NOTE: The MagE pipeline is based on the MIKE pipeline; therefore, the steps for running the pipeline, and even the names of the commands that are used, are very similar. If you are familiar with running the MIKE pipeline, then you should not have a hard time learning how to run the MagE pipeline.

Once you have CarPy installed, all you need to do is source the included Setup.csh or Setup.bash file, and you're good to go! For example:

<csh prompt%> source /where/CarPy/is/installed/Setup.csh
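
If you want a quick sanity check that the setup worked, something like the following sketch may help. It assumes (this is not stated above) that sourcing the Setup file puts "magedb" and "magesetup" on your PATH, and that you start Python from the same shell in which you sourced it. The function name is made up for this example.

```python
import shutil


def missing_commands(commands=("magedb", "magesetup", "make")):
    """Return the commands that cannot be found on the current PATH."""
    return [cmd for cmd in commands if shutil.which(cmd) is None]


if __name__ == "__main__":
    missing = missing_commands()
    if missing:
        print("Not found on PATH:", ", ".join(missing))
    else:
        print("All pipeline commands were found.")
```

If anything is reported as missing, the most likely cause is that the Setup file was sourced in a different shell session.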

2. Requirements for Running the Pipeline

Typically five types of frames are used during the reduction of MagE data:

Xe-Flash frames - used to locate the order edges
Xe-Flash frames (defocused) - used to construct a flat field for the blue end of the spectrum
Dome flats - used to construct a flat field for the red end of the spectrum
Target frames - used for science
Lamp frames - used to map the line curvature and for wavelength calibration

3. Setting up the MagE Database File

The first step is to create a text file that lists the important properties of your MagE FITS files. This file will be referred to as the MagE Database File.

For a quick start, you can generate this database file by typing

<csh prompt%> magedb -d DATADIR

where DATADIR is either a relative or absolute path to the directory in which you have your FITS files. The task "magedb" reads the headers of the images and stores the relevant properties in a new text file.

For example, suppose you kept your data for an entire observing run in a directory called "/data1/vulcan/edwardv/MagE/sept2011". Then, you would type:

<csh prompt%> cd /data1/vulcan/edwardv/MagE
<csh prompt%> magedb -d sept2011

or if you would like to keep things a bit more organized, you could do the following:

<csh prompt%> cd /data1/vulcan/edwardv/MagE
<csh prompt%> mkdir Reductions
<csh prompt%> cd Reductions
<csh prompt%> magedb -d ../sept2011

After running this step, you will find a file with "MAGE.db" in its name. In the above case we are left with sept2011MAGE.db in the current working directory. This file contains a listing of the absolute paths of the FITS files so it need not sit in the directory in which you will be doing your reductions (even though it can be very handy to have around). Recent updates to the pipeline codes may produce "db" files that look a little different than the above example.

You might typically generate a database file for your entire run, rather than for individual nights. However, if you do generate multiple database files, you can "cat" them together into one (longer) database file. You can combine the data from multiple observing runs in this way if you want your output data products (i.e., extracted spectra) to include data from all of those runs.
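
For those who prefer to do the concatenation from Python, here is a minimal sketch equivalent to "cat". The function name and the file names in the usage comment are hypothetical. The one difference from plain "cat" is that the sketch adds a newline if an input file does not end with one, so that entries from two files cannot run together.

```python
from pathlib import Path


def combine_db_files(inputs, output):
    """Concatenate database files in the order given, like `cat a b > out`."""
    with open(output, "w") as out:
        for name in inputs:
            text = Path(name).read_text()
            out.write(text)
            # Add a final newline if the file lacks one (plain cat would not).
            if text and not text.endswith("\n"):
                out.write("\n")


# Hypothetical usage:
# combine_db_files(["run1MAGE.db", "run2MAGE.db"], "combinedMAGE.db")
```

It is still worth opening the combined file in an editor afterwards, for the same reasons given for any other database file.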

Important: the MagE Database File pulls information from the headers, so to use the resulting "MAGE.db" file you must be able to trust the data stored in the FITS headers. Most of the information that is used is generated by the instrument control systems and has not been altered by the observer. However, the "OBJECT" keyword is also read, so successful use of the "MAGE.db" file relies on your having fixed errors and/or inconsistencies in the "OBJECT" header data. I highly recommend that you view the "MAGE.db" file in your favorite editor and verify that the information in it is correct.

4. Generating Pipelines

Upon completion of the "database file" describing the observations, one invokes "magesetup" (replacing "sept2011MAGE.db" with whatever your database file is called):

<csh prompt%> magesetup -db sept2011MAGE.db

This will run through the database file in its simplest mode and return something that looks like a list of all the targets in the database file.

#------------------------------------
#  J                         Target
#------------------------------------
   1                      v23900_11
   2                         v32_11
   3                      v19309_11
   4                 cs22964-161_11
   5                    cs22888-014
   6                    cs29513-003
   7                    cs30493-071
   8                    cs22876-032
   9                    cs22882-006
  10                    he0200-5701
  11                    cs22968-029
  12                      sc10-1838
  13                      v15923_12
  14                    hd193901_12
  15                 cs22964-161_12
  16                    he2122-4707
  17                    cs22945-017
  18                    cs22945-029
  19                    sc10-137844
  20                    cs22882-030
  21                        sc4-630
  22                      sc10-1896
  23                      hr1621_12
  24                       v6840_13
  25                         v32_13
  26                    cs22898-047
  27                    cs30492-110
  28                    he2319-5228
  29                    cs22945-058
  30                 cs22964-061_11

Note that these are similar to the OBJECT entries in the database file. The script turns your OBJECT names into things that are a bit more useful for making filenames (e.g. stripping spaces, converting to lower case). One of the reasons for doing this to the object names is to make them a bit more repeatable so that multiple instances of the same target may be more easily matched up. If you see targets listed in there (e.g. funny ones like "bias", or "zero", etc) that you do not really want to reduce, then you should either (1) comment those lines out with a "#", or (2) delete the relevant lines, or (3) put the word "ignore" in the OBJECT entries of those frames (see note below).
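
To get a feel for why this cleanup helps, the sketch below applies only the two steps named above (removing spaces and converting to lower case) and groups the raw OBJECT strings that end up with the same name. It is an illustration, not the code "magesetup" uses; the real rules may well do more (for example, the listing above shows suffixes such as "_11" that this sketch does not reproduce). The function names and the sample OBJECT strings are invented.

```python
import re
from collections import defaultdict


def normalize_object(name):
    """Approximate the cleanup described above: drop whitespace, lower-case."""
    return re.sub(r"\s+", "", name).lower()


def group_by_target(object_names):
    """Map each cleaned-up name to the set of raw OBJECT strings behind it."""
    groups = defaultdict(set)
    for name in object_names:
        groups[normalize_object(name)].add(name)
    return dict(groups)


# Invented OBJECT strings, as an observer might have typed them.
names = ["CS22898-047", "cs22898-047", "CS 22898-047", "HE2319-5228"]
for target, variants in sorted(group_by_target(names).items()):
    print(target, sorted(variants))
```

A typo such as "cs22898-47" would still show up as its own target, which is exactly the kind of thing to look for in the "magesetup" listing.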

You will also notice that nothing else has happened other than the printing of a list of target names. No pipelines or new subdirectories are created when "magesetup" is run in this manner. By running "magesetup" in this way, you will see all of the assigned target names. This is important if you wish to (1) check for bogus targets; (2) check for typos that may have led the scripts to not recognize identical targets for multiple frames; or (3) to ultimately allow you to generate pipelines only for specific targets since you may not know what the scripts will call your science targets. After you are satisfied with the list of processed target names, you may either generate a pipeline for a specific target, or for all targets. To generate it for a specific target:

<csh prompt%> magesetup -db sept2011MAGE.db cs22898-047

This should generate several subdirectories and files. There will now be a makefile with the default name "Makefile", which contains the commands to (1) generate a new normalized flatfield, (2) generate a good reference wavelength solution for a lamp frame from the red data, and (3) reduce the data for the science target. The flatfields will be called "blueflat.fits" and "redflat.fits" and will later be combined to make "flat.fits"; the reference lamp frame will be called "lamp.fits". In your working directory these will be symbolic links to files within two of the new subdirectories. If you already have a good flatfield from, for example, the previous night, then feel free to copy (or link) it into your working directory with the name "flat.fits". In that case "magesetup" will skip setting up a flatfield pipeline and use the one you provided.

If you plan to have separate makefiles in the same directory, you may want to give your makefile another name. To do so, just add the "-mk" argument to the "magesetup" command, followed by the name you want to give the makefile. For example:

<csh prompt%> magesetup -db sept2011MAGE.db cs22898-047 -mk sept2011.make

If you need to regenerate the pipelines and makefiles, including the ones for the flatfield and lamp, your simplest option is to type:

<csh prompt%> make regenerate

Note that if you told "magesetup" to name your makefile something else, like "sept2011.make", for example, then you will need to specify the file each time you use the "make" command:

<csh prompt%> make -f sept2011.make regenerate

The science pipeline is placed in a subdirectory named after the target, here "cs22898-047", which contains a few files. You can also forgo specifying individual targets and tell "magesetup" to generate subdirectories and pipelines for all targets:

<csh prompt%> magesetup -db sept2011MAGE.db -all

This is the most fruitful way to run things as long as you don't have bizarre "objects" listed as targets (such as "bias", or "zero", or other things that you do not wish to reduce as targets).

After "magesetup" generates all the pipelines, you can now reduce your data by typing:

<csh prompt%> make

IMPORTANT: Please try to cull bad data out of the "MAGE.db" file. Please take a look at the database file and either delete useless frames or place a keyword (such as "ignore") within the OBJECT entry. Alternatively, when you run "magesetup", you can specify "-ignore VVVVV", and any frame with "VVVVV" in its OBJECT entry will then be ignored. The default keyword is "ignore".

IMPORTANT:

The first time you run magesetup it will generate subdirectories and pipelines for processing the flats (see below).
The script by default assumes that if an entry in the database has "thar" in the modified OBJECT (i.e., spaces removed, concatenated, forced to lower-case, etc.) then the file is a comparison lamp. Alternatively, run "magesetup" with "-lampkey WWWWW" where "WWWWW" is the string that quasi-uniquely identifies the lamp frames. The lamp exposures are associated with the science frames by searching for the lamp with the nearest MJD (at the midpoint of the exposure), with a small perturbation to search for the frame with an appropriate slit angle.
The script by default assumes that if an entry in the database has "flash" in the modified OBJECT (i.e., spaces removed, concatenated, etc.) then the file is a flatfield that will be used for the blue orders. Alternatively, run "magesetup" with "-blueflatkey XXXXX" where "XXXXX" is the string that quasi-uniquely identifies the Xe-Flash frames.
The script by default assumes that if an entry in the database has "dome" in the modified OBJECT (i.e., spaces removed, concatenated, etc) then the file is a flatfield that will be used for the red orders. Alternatively, run "magesetup" with "-redflatkey YYYYY" where "YYYYY" is the string that quasi-uniquely identifies the dome flat frames.
The script by default assumes that if an entry in the database has "flash" in the modified OBJECT (i.e., spaces removed, concatenated, etc) then the file is an Xe-Flash frame that will be used to find the order edges. Alternatively, run "magesetup" with "-slitfnkey ZZZZZ" where "ZZZZZ" is the string that quasi-uniquely identifies the Xe-Flash frames.
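
The keyword rules above can be summarized in a short sketch. This is only an illustration of the logic as described here, not the code inside "magesetup": the function names are made up, the "ignore" check is assumed to be a plain substring match on the modified OBJECT, and the lamp matching uses the nearest MJD only, leaving out the small slit-angle perturbation.

```python
# Default keys as described above; each can be overridden with the
# corresponding "magesetup" option named in the comment.
DEFAULT_KEYS = {
    "lamp": "thar",       # -lampkey
    "blueflat": "flash",  # -blueflatkey
    "redflat": "dome",    # -redflatkey
    "slitfn": "flash",    # -slitfnkey
}


def frame_roles(modified_object, keys=DEFAULT_KEYS, ignore="ignore"):
    """Return the calibration roles whose key appears in the modified OBJECT."""
    if ignore in modified_object:  # assumption: simple substring match
        return set()
    return {role for role, key in keys.items() if key in modified_object}


def nearest_lamp(science_mjd, lamp_mjds):
    """Pick the lamp whose mid-exposure MJD is closest to the science frame.

    The slit-angle perturbation mentioned above is deliberately omitted.
    """
    return min(lamp_mjds, key=lambda mjd: abs(mjd - science_mjd))


# Invented values for illustration.
print(sorted(frame_roles("xe-flash")))
print(nearest_lamp(55820.30, [55820.10, 55820.28, 55820.45]))
```

Note that with the default keys a frame with "flash" in its name qualifies both as a blue flat and as a slit-function frame, so the defocused and in-focus Xe-Flash frames need distinct keys if you want them treated differently.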

5. Running the Calibration Pipelines

The first time you run "magesetup" such that it generates pipelines and subdirectories, you will find the following in your top-level working directory:

Directories called "lamp", "flatblue", "flatred", "flat", and "slit".
Symbolic links to the final images produced in each of the above directories. At the moment, these files do not yet exist.

If you wish to only run the calibration pipelines first, you can type:

<csh prompt%> make calib

The above will run the pipelines in the following order: lamp, blueflat, redflat, flat, and slit. The output for each pipeline is written to a log file within that directory called XXXX.out1 (e.g. lamp.out1). If you run "make calib" again, the output is written to a log file named XXXX.out2, and so on for each further run.
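
After a few reruns it can be handy to find the most recent log. The sketch below picks the highest-numbered XXXX.outN file in a directory; the function name is made up, and you pass the "XXXX" part explicitly because only the "lamp.out1" example is given above.

```python
import re
from pathlib import Path


def latest_log(directory, stem):
    """Return the highest-numbered STEM.outN file in directory, or None."""
    pattern = re.compile(re.escape(stem) + r"\.out(\d+)")
    best = None
    for path in Path(directory).iterdir():
        match = pattern.fullmatch(path.name)
        if match:
            number = int(match.group(1))
            # Compare numerically so that .out10 sorts after .out9.
            if best is None or number > best[0]:
                best = (number, path)
    return best[1] if best else None


# Hypothetical usage:
# print(latest_log("lamp", "lamp"))
```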

You can also go into each directory and run the Makefile there by hand, by just typing "make". In that case the output is not written to a log file unless you redirect it on the command line. For example:

<csh prompt%> cd lamp
<csh prompt%> make >& lamp.log

After a while you will find a new file called "lamp.fits". This file need only be generated once because the pipelines generated for the science targets will use symbolic links to the "lamp.fits" file. The same goes for the other calibration pipelines.

As the individual pipelines run, each step, or stage, generates an empty file to denote completion of that stage. These files tell "make" what steps remain to be completed. Thus, if you run your flatfield pipeline to completion and then, for some reason, go back to your master working directory and type "make regenerate", the flatfield pipeline will not be rerun by typing "make" alone. When you type "make", it checks whether any stages remain to be executed by looking for these empty "stage-XXXX" files; the stages whose files are missing are the ones that "make" will run.

Similarly, empty files called "targ_XXXX" will also be generated in the top-level directory, once the pipeline for that object is completed. So when you type "make", it'll look for which calibration targets haven't been completed yet when running the top-level Makefile.
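
If you want to see at a glance what is still left to do, you can look for the missing marker files yourself. The sketch below assumes that the "XXXX" in "targ_XXXX" is the name of the pipeline directory (e.g. "targ_lamp"), which is a guess based on the description above; the function name is made up.

```python
from pathlib import Path


def pending_targets(workdir, targets):
    """Return the targets that have no targ_<name> marker file in workdir."""
    workdir = Path(workdir)
    return [t for t in targets if not (workdir / f"targ_{t}").exists()]


# Hypothetical usage, run from the top-level working directory:
# print(pending_targets(".", ["lamp", "flatblue", "flatred", "flat", "slit"]))
```

Deleting a marker file by hand should, by the same logic, make "make" treat that pipeline as unfinished, but check the generated Makefile before relying on this.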

6. Running the Science Pipelines

For every target processed by "magesetup", there are new subdirectories, with names equivalent to the target's OBJECT name in the .db file. In each of these subdirectories you will find symbolic links to the relevant normalized flatfield, and you will also find shell scripts that contain the long list of commands for reducing the data for that target.

When you run "magesetup" to generate the pipelines, such as in

<csh prompt%> magesetup -db sept2011MAGE.db -all

you will get a file called "Makefile" and all you have to do is then type:

<csh prompt%> make

to reduce the data for all of your science targets, given that all of the calibration pipelines are finished. Alternatively, you can also type the following to get the science pipelines going:

<csh prompt%> make science

Just like with the calibration pipelines, an empty file called targ_XXXX will be made when the pipeline for each science target has run successfully. This way, if you had to interrupt the pipeline for whatever reason, you can just pick up where you left off.

7. Other Optional Parameters for "magesetup"

7.1. Skipping the Generation and Usage of a Slit Function

If you do not want to derive a slit function, then run magesetup with -noslitfn. This will also remove the step of dividing the frames by the slit function in the science target pipelines.

7.2. Output 2-D Spectra

Running magesetup with -2D will set up the pipelines so that 2-D spectra are also produced at the end, in addition to the usual 1-D extractions. The 2-D object spectra are written to files named XXXX_sum.fits, alongside similar FITS files containing the 2-D sky and noise spectra.

7.3. Extractions for Individual Exposures

Running magesetup with -individual will set up the pipelines so that the exposures are not stacked at the end. 1-D extractions will then be produced for each individual exposure, whose names will contain the number from the original exposure.

8. If All Goes Well

Currently many images from intermediate steps are not removed. Also, a great deal of information is stored and updated in the headers of the images. Some PostScript plots are generated by particular tasks, and these can be helpful when debugging problems.

At the end of running all of the commands in a given pipeline script, you should be left with a "multispec" FITS file, which contains, for every order:

Sum of the sky over the extraction aperture.
Sum of the object over the extraction aperture.
Expected noise from sum of object and sky plus the read noise.
Signal-to-noise spectrum (per pixel).
The sum of the lamp spectrum over the extraction aperture. However, this is only from the first lamp exposure since small wavelength shifts between lamp and science frames would cause unwarranted blurring of the lamp lines in this extracted spectrum.

These small shifts are measured in the science frames using the positions of the night sky emission lines. The dispersion solutions for the science frames, as modified by the sky lines, are then technically invalid for the lamp exposures.

The last command in each pipeline is to copy this "multispec" FITS file (the one called XXXX_multi.fits) into the Final-Products/ directory. Thus, after running all of the pipelines, you should (hopefully) only have to look in Final-Products/ to get your extracted spectra.
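
To collect the results for further work, a sketch like the following lists the multispec files in Final-Products/. It assumes that the "XXXX" in "XXXX_multi.fits" is the target name, which is not spelled out above; the function name is made up.

```python
from pathlib import Path


def final_products(workdir="."):
    """Map each XXXX prefix to its XXXX_multi.fits file in Final-Products/."""
    suffix = "_multi.fits"
    products = {}
    for path in sorted(Path(workdir, "Final-Products").glob("*" + suffix)):
        # Strip the suffix to recover the XXXX part of the file name.
        products[path.name[: -len(suffix)]] = path
    return products


# Hypothetical usage, run from the top-level working directory:
# for name, path in final_products().items():
#     print(name, path)
```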

Resources

Will update resource pages soon!