Finding Precise Coordinates for Images

John Thorstensen, Dartmouth College

Revised 2003 October

Table of Contents:

Overview:

This document describes software installed on Agung and Hill (the Linux boxes) which sets up the so-called world coordinate system (or WCS) in your direct image headers. The WCS is a precise RA and dec solution in the image header. Once an image has the WCS set up correctly, you can instantly get a precise RA and dec for any point on your image. A rich set of applications use the WCS for mosaicking, aligning optical data with images from other sources, and so on.

Basically, the procedure described here (1) automatically finds stars in your pictures, (2) matches them to entries in the huge USNO A2.0 star catalog, and (3) puts the coordinate solutions from these stars into the headers. The software is highly automated and can process many images quickly and conveniently, or can be used on single images. The automatic algorithms don't always work perfectly, but the software includes tools which can ferret out these cases and help you fix them efficiently. When things go smoothly, as they will with most images from single MDM CCDs, the process is fast and almost effortless.

The procedures here come in two flavors -- single-image or batch. Doing many images in batch is efficient and also makes possible some very useful consistency checks.

The first part of this document gives important background information and quick instructions. The rest gives more details, including suggestions for how to proceed when things don't work (this material is mostly toward the end).

1. Important Considerations - READ THIS FIRST

Brief Instructions for Use

Doing a Single Image

That's all there is to it!

You can tell if the WCS was installed by typing imhead imagename long+ in IRAF or pyraf; you should see various WCS keywords at the end of the image header, stuff like CTYPE1, CRVAL1, CRPIX1, CD1, and so on. If they aren't there, there was some kind of trouble writing to the image.

So it ran -- but is it right? You can get a good idea by common sense: Are a reasonable fraction of the stars found? Are the residuals small (i.e., under about 0.8 arcsec for USNO A2.0, under about 0.2 for UCAC2)? Is the rotation angle good? You can also check the match with the following procedure:

Here's something nice: if you display the wcs'd image in native DS9 (by reading it in directly, not displaying from IRAF), the RA and dec of the cursor will display real-time! (If the frame appears all black, select scale, then zscale to reproduce the IRAF autoscaling. DS9 knows the whole range of pixel values, not just the 200 or so which ximtool uses. A DS9 tutorial is linked here.

What it if fails?

The last few pages of this document for cleaning up cases in which the automatic matching failed. Often the header information is screwed up. Sometimes some hand work with SExtractor output will give a better catalog of detected stars to work with than the automated version does (more detail here). Options available in handmatchusno can be used to rescue tough cases with a little effort on your part -- as little as possible, I hope!

Doing a batch of images.

This is a little more complicated than doing a single image, but running the following pipeline is much less work than doing the solutions individually, and it enables some nice automatic consistency checks. Here's how:

Note that the procedure affects some important header information. At MDM, the keywords RA, DEC, and EQUINOX are used by CCDCOM to record the telescope coordinates and their equinox (i.e., the numbers derived from the encoder readings). The WCS procedure commandeers the EQUINOX keyword for its own purposes, and there's no way around it except to reverse-engineer the WCS, which is a sizable job. For this reason the script runs a program which probes the headers and, if it appears appropriate, preserves `safe' copies of the original information in the new keywords TELRA, TELDEC, and TELEQNOX. This alone would leave the problem that the RA and DEC could disagree with the (new) EQUINOX keyword -- if you'd taken your data with the telescope readout in 1950, say, the RA and DEC would be for 1950, but the EQUINOX would read 2000 (which is the WCS default). To avoid this problem, I've included a step in the pipeline to load the computed coordinates of the exact field center into the RA and DEC fields.]

Gory Details

This section describes how these procedures work in more detail, which may be of interest to some. If the canned procedure doesn't work for you, you may be able to understand the problem and fix it by reading this more detailed material.

The main procdures ending with .py are standalone python-language scripts. These organize everything, handle numerous pesky details, call IRAF tasks through the pyraf mechanism as needed, and run the C-language programs which do the grunt work -- the marvelous SExtractor package and my own C-language programs matchusno2 and checkwcs2. [Though complicated, this procedure is considerably more streamlined than the first versions, in which perl scripts were used to generate IRAF cl scripts, which then had to be executed in a separate step.]

I'll go through everything here in the order it happens.

The single-image processing program make1match.py reads the name of the image from the command line; the batch pipeline makematch.py reads the command-line argument, opens a file of that name, and begins reading image names from that file. The processing of each image proceeds as follows; the procedure is similar in the two cases until the end of the process, where it forks somewhat. Note: In what follows I use imagename as the generic image name, which might actually be (e.g.) ccd.123.

Preparing the headers. As noted earlier we want to preserve the original telescope pointing information somewhere. The python scripts first check if a TELRA keyword exists, and if it doesn't (meaning the data are fresh) it copies the RA, dec, and equinox keywords over to TELRA, TELDEC, and TELEQNOX. This was formerly done with an elaborate C program called prephead, which is no longer needed.

Getting information from the header. The program calls the IRAF hselect command to query the image header for the following information:

telra  teldec  teleqnox  naxis1  naxis2  secpix1 filter
The result is handed back through pyraf's handy "Stdout=1" mechanism, and written out to a little file called imagename.hsel. If this information is not present, or wrong, the procedure will fail The filter information is useful but not essential, but all the rest is strictly necessary.

Note that the previous step uses MDM-specific keywords (e.g., the telescope coordinate equinox needs to be stored as EQUINOX, and the scale has to be SECPIX1). If you're adapting this procedure to other sites, better fix this.

In the matching step later we need to know what part of the star catalog to use, so the telescope coordinates are used to figure out the search center, naxis1 and 2, and secpix1 are used to determine the field size. Secpix1 (the image scale in arcsec per pixel) is crucial later when it comes time to match the stars -- if this is seriously off (e.g., using the 2.4m parameters on the 1.3m), you'll have to fix it in the headers and re-run the script. The filter information is not absolutely essential, but if this field starts with U, B, V, R, or I it will use the USNO color information to guesstimate magnitudes of the stars in the picture's passband. These are used to sort the USNO list to better match the brightnesses in the picture, which improves the chances of a quick and correct match. If the program can't make sense of the filter information it uses the USNO r magniitudes to sort the list.

Extracting objects with SExtractor. Next, the program calls the incredible SExtractor program through python's os.system method. SExtractor chews through the picture and generates a list of objects found within it. There are a number of configuration files which I've tuned up for the task at hand and put in a centralized place /usr/local/pkg/thor/sexdef. The configuration files are set up to:

SExtractor generates a file called test.cat, which is transient. For more detail, see: http://www.eso.org/science/eis/eis_doc/sex2/sex2html/sex2_doc.html.

Filtering the object list with cat2seeing This C-language program reads the test.cat file generated by SExtractor and filters it to find real stars. It generates as output a file called imagename.cat. First, it filters the output for cosmic rays using three different tests (the `magic numbers' given here are defined constants in the code, which can be changed easily on recompilation. If you need to do this, copy and make your own private versions).

The FWHM seeing is then estimated from the mode of the FWHMa of the remaining detections, and only detections with FWHM within a reasonable margin of the modal FWHM are retained. This step can fail if there are very few bona-fide stars on an image. There's an exception: objects which are larger than the FWHM, and which have a maximum pixel with more than 30,000 counts, are taken to be saturated stars, and are accepted. If you've done something screwy with the middles of the bright stars (e.g., lopped off the peaks at some smaller data value), they won't be counted properly. Finally, detections near the limit are rejected. The output imagename.cat file contains the modal FWHM on the first line (which starts with a #), followed by the stars in order from the brightest to the faintest.

There are other procedures for finding stars in images (e.g., DAOFIND), but the ones I've seen need an initial guess as to the FWHM. This is fine when you're poring over a precious summary image of some cluster, but it's far too much work with hundreds of images. The SExtractor procedure used here usually does a reasonable job with no human intervention. DAOFIND is also problematic when it comes to saturated images, which the modified SExtractor command seems to deal with reasonably well.

Finding the matches with matchusno2. The python script runs this C-program using the os.system method. The program takes the imagename as input, and then does the following.

The matching algorithm used is mind-numbingly simple but apparently original. It may not be as elegant as some, but it uses two important pieces of prior information we have here which aren't usually assumed in the general problem, namely (1) we know the image scale in arcsec per pixel quite accurately (from the imagename.hsel file), and (2) we have some idea of the brightnesses of the stars. Thus the algorithm starts with the brightest USNO star, and tentatively assumes that this is the same as the brightest star in the image. It then takes the next brightest USNO star and assumes it's the next brightest star in the image. With two tentative matches, it computes an implied image scale. If this matches the real image scale within half a percent, it starts looking for still more matches. If a third match is found which preserves the image scale with the first two matches, it computes a transformation and starts looking for still more matches. In order to be added to the match list at this point, a star has to fall within 1 arcsec of where it should be. If four or more matches are found, it assumes it's finished and mops up by looking for all further matches. Until it finds four matches, it keeps working its way down the star list, trying different correspondences. This algorithm as coded has something like 6 nested loops and runs formally as N6, but in practice it's much faster because it uses the prior information about magnitudes to make good first guesses, and the prior information about the image scale to stay out of the innermost loops until there's a good prospect of success. In most cases a match is found almost immediately. The number of stars in each list is limited to 39, and running down all the possibilities when there are no real matches takes maybe 20 sec on a fast Linux box.

Note that it is quite possible for the algorithm to find four spurious matches and declare victory. This is an insidious error. In single-frame mode, you should examine the output critically to satisfy yourself that this hasn't happened. In batch mode the checkwcs2 processing step can be set up to apply several stringent tests to the image set. Read on.

As noted above, once the match is found among the brightest stars, an iterative search is made for correspondences between all the detected stars and all the USNO stars. There's an interesting technical wrinkle here: in the first pass, the parity of the image is unknown, and a 6-constant plate model is used which does not specify the parity of the image (or the perpendicularity of the axes). By comparing the signs of some of the fit parameters, it's possible to deduce the image parity from this fit. For subsequent fits a four-constant model is used, which requires the image parity to be known a priori. The four-constant model in effect fits a scale factor, a pure rotation, and a zero-point shift -- that is, we assume that the CCD has square pixels and perpendicular axes. It's necessary to use this less forgiving fit (once the parity is established) because the situation sometimes arises where all the stars matched are nearly along a diagonal of the picture; because the 6-constant model does not enforce perpendicularity of the axes, the fit then goes bad in the empty corners. It's theoretically possible to have a the initial fit give the wrong parity if the stars are nearly co-linear, but it's more common for the automatch to be completely spurious.

Note that the USNO RA and dec are transformed to tangent-plane coordinates eta and xi before any fitting is done. Thus there's no worry about distortions near to the pole, etc. [However, the USNO catalog lookup software fails very close to the actual pole, where more than 15 min of time is covered by a single picture. Rarely a problem.] You can find a nice discussion of this trick in Taff's Computational Spherical Astronomy, or in any textbook of spherical astronomy. The key idea is that xi and eta are excellent local Cartesian proxies for RA and dec.

Finally, a word about the compression of the USNO used here. At first I'd hoped to use the USNO SA2.0 for this, which is a compressed version of the USNO A2.0. However, the method by which stars were selected for the SA2.0 was evidently to choose stars nearest to specified grid points without regard to magnitude. At low latitude, that means that the USNO SA2.0 looks like 'graph paper', and it's nearly impossible to see patterns, either for a human or a machine. Therefore I prepared a compression of the USNO by cutting out everything at dec < -45 degrees, and then sifting the remainder by going through 1 square degree at a time and saving the brightest 3000 stars in each square degree. (The r magnitude was used for this; because of the magnitudes are only given to 0.1 mag, I actually cut at a magnitude limit so that 3000 or fewer were kept.) This preserves nearly all the stars at high latitude, but only the brighter objects at low latitude. For the present application, it performs about as well as the full USNO. The compressed version is about 1.15 Gbyte, while the full version is about 6.5 Gbyte, which isn't completely trivial even with today's gigantic disks.

Why matching fails: I've tested this suite of software extensively, and it works pretty well. But there are lots of ways for it to fail. Here's a checklist:

If at least four matches are found, the matchusno2 program writes a file called imagename.radec.match.wcs. Each line of this file contains information for a single matched-up star. There's a running number, followed by the x and y pixel coordinates, followed by the RA and dec from the USNO (or later UCAC2) in 'colon-ized' format. This is the main product of the match procedure, and it will be used to feed the IRAF WCS coordinate loader in the next step.

Installing the WCS solution in the image headers. In single-picture mode, the matchusno2 program queries the user whether to accept the match it has found, and if the user approves it writes a file wcs_script1 which then is executed automatically. loading the WCS into the image headers. If you reject a fit, an empty script is written which will fail to execute, so the program won't mistakenly use an old, incorrect script and write the wrong WCS.

For batch processing, checkwcs2 is an intelligent script-writer for this step. You need to run this one separately, but the makematch.py script reminds you to do so as it finishes. checkwcs2 takes as its first argument the list of filenames, and an optional second argument suppies a number giving a rotation angle tolerance in degrees, for example

    checkwcs  imagelist  1.

For each image name, it looks for the .radec.match.wcs file; if it finds it, it reads it, and if it's not found, it writes an entry in a file called unmatched_pix. Using the data from the match file, it re-generates the fit and figures out the parity and rotation of the image. Next it reads the .hsel file. Using NAXIS1 and NAXIS2 values, it finds the xy coordinates of the image center. It then translates these into the exact RA and dec of the image center using the coordinate fit.

Once checkwcs2 has digested the information for all the images, it makes the following two safety checks:

Images which fail the parity or rotation test are appeneded to the unmatched_pix file.

If very few images are included in the image list, the statistical tests aren't done -- in that case a little summary is printed out for a sanity check. If an image comes out at a crazy angle or something, better not use that fit.

checkwcs writes a cl script to the file wcs_script which will do the following for each image with a good solution:

[Note that checkwcs2 can also work for a single image, in which case you type checkwsc2 imagename s. The second argument s tells the program to treat the first argument as the name of an image rather than a list of image names, and suppresses the statistical checks. The program instead simply reads the imagename.radec.match.wcs file, fits it, presents the fit to the user, and asks for approval, just as in the last part of matchusno in single-picture mode. You then have to type ./wcs_script1 as a separate step to load the coordinates. Ordinarily this won't be used much.]

Checking the results. checkwcs2 spews out some hopefully informative output. You can go through the unmatched_pix file to see what may have gone wrong. checkwcs also produces a more detailed file, wcsaudit, which has capsule summaries of each fit (including the number of stars used, the rotation angle, the parity, and the rms). Because both the unmatched_pix file and the wcsaudit file are appended to, rather than created anew each time (as is the wcs_script), the output is time-stamped. That way, if you rerun checkwcs, you can figure out which run of the program started where. Important pictures which didn't match may take some hand work -- see the end of this document.

Even Gorier Details

If you want to recompile the code, please make your own copy! If you're contemplating this, I assume you have some idea what you're doing. The codes are in /usr/local/pkg/thor. They link with various things; the makexxx files are not traditional `makefiles' but just executables which compile and link everything which is needed. Utility packages skycalc and numrec are called for, but you shouldn't have to mess with them.

Surgery on the guts of the code is likely to be bloody and painful. There are great swaths adapted from other applications in which the variable names are obscure, etc.; the 'double transformation' of going from RA and dec to eta and xi, and then to X and Y, makes for a rather messy business. If you run into some problem I may be able to help (and I'd like to make the code useful to as many as possible); john dot thorstensen at dartmouth dot edu.

If you want to install the procedures on a machine at home, you'll need the USNO (or better, my reduction of it), PGPLOT for the handmatchusno stuff, and (of course) my various source-code stuff. It shouldn't be hard to do if you're a guru.

On the less esoteric side, you can alter the action of SExtractor by copying the

/usr/local/pkg/thor/sexdef/starfind.sex
file to your local directory, renaming it default.sex, and taking out the -c blahblah flag from the calls to sex.

Some uses for the WCS

The WCS stuff is highly integrated into a number of packages. This just scratches the surface.

If you display an image in native DS9 (tutorial intro here, home page here), using DS9's File menu, not from within IRAF, then DS9 knows knows about the WCS and will display the RA and Dec of the cursor as you move around. The align wcs option under the Zoom menu will rotate the image to put true north exactly at the top. It's also possible to overlay the RA/dec coordinate grid on the image. This is all very cool.

DS9 has another remarkable use for the WCS. If you display a WCS'd image, and look under the Analysis menu at DSS server, you can automagically call up a Digital Sky Survey (DSS) image of the same area of the sky from the Web (either from CfA or STScI).

It gets even better!! Under the Frame menu, the match frames menu has a WCS option, which will automatically align all the other images with the image you're displaying. Then selecting Frame and blink will blink them -- so within a few seconds you can be blinking your recently-taken image against the DSS!

If you display an image from within IRAF on ximtool or DS9, you can run imexamine, and then from within imexamine you type the colon commands

: set wcs world
: set xformat \%15.3H
: set yformat \%14.2d
-- subsequent invocations of c and a will give RA and dec along with X and Y.

Either of these options can be used for a reality check -- just graph the USNO stars (perhaps with handmatchusno, display the image with the wcs, and see if the stars' RAs and decs correspond correctly.

The stsdas package for IRAF has some WCS-savvy tools. The imgtools commands under toolbox include rd2xy and xy2rd, wihch do quick conversions (as their names suggest).

The generic IRAF image package, under immatch, has tools wregister and sregister which can be used to align a set of images onto a common grid using the WCS; they can then be mosaicked or averaged. This is a stupendously inefficient way to mosaic images (as far as computer resources go), but it does work. These tasks can be used to register dithered images for stacking; this is a lot easier than laboriously compiling lists of XY shifts for the images you wish to stack.

The USNO A2.0 at MDM

As noted earlier, there's a customized 1.2 Gbyte version of the USNO A2.0 on the Linux boxes at MDM. See the earlier text for details of how this subset was selected. There are a couple of utilities to use it:

Note that the USNO has a hole around the Galactic center, and in a few other parts of the sky which was burned out on the original POSS-I plates. Also note that the plates were taken in the 1950s and a (very) few objects have moved significantly since then. Plates north of dec = -15 were POSS-I and quasi-simultaneous; plates south of there are UK Schmidt, are more recent, and were not simultaneous, so variability can create spurious colors. See the USNO website for greater detail.

There'a hook in handmatchusno which allows you to substitute a faked USNO file for the USNO on the disk. This is helpful in those parts of the sky where the USNO is burned out or otherwise useless. You need to put your star list into the form hh mm ss dd mm ss mag mag. The GSC 1.2 can be useful, but needs to be put into this format. Downloads of the full USNO from the Lowell website are already in this form. See the end of this document for more detail.

Thorstensen's astrometry programs

There are a number of homegrown programs which make use of the match products.

Fixing Things By Hand

Sometimes things just don't go right and you have to do something manually. Sometimes it's a matter of fixing up the catalog or hsel file and re-running the automatic matchusno by hand. Other times you have to intervene more closely -- for that, I've come up with a power tool for this called handmatchusno. First, a few preliminaries.

Displaying and marking images. It's good to know about the IRAF tvmark command -- with this, you can mark on the display the stars which are listed in a *.cat file. To invoke it, pop up an ximtool or ds9 window, and from within IRAF display the image with

display imagename 1
and mark the image with
tvmark 1 imagename.cat
Useful tvmark parameters are mark (I like `circle'), radii (5 is good), and color (205 = green, 204 = red, 206 = blue -- isn't it obvious?).

Making your own detections catalog file. Sometimes if you display and tvmark an image, it'll be obvious that the cat2seeing selection has gone haywire. If you can come up with a decent star catalog, you can try matchusno2 (the automatic version) again. One easy way to fix this problem might be to adopt an alternate program, like daofind, which requires more input from the user than SExtractor but which works reasonably well, though it will have trouble with saturated stars and requires some parameter tuning for best results. Just rename the output imagename.cat so matchusno2 will see it.

Another strategy is to filter the output from SExtractor manually. [Note to self: It would be good to write a cursor-driven program to do this interactively.] Start by re-generating the test.cat file, by typing

sex -c /usr/local/pkg/thor/sexdef/starfind.sex  imagename.fits
You can sort the test.cat file using the native Linux sort command. If you wanted (say) to sort in order of the numerical value in the 3rd column, and put the result in a file called temp1, you'd type
sort -n -k 3 test.cat > temp1
Note that the columns in the test.cat file are such useful diagnostics as FWHM and magnitude, so you may be able rather easily to create a file consisting of mostly stars, by repeatedly sorting the file and cutting away the unwanted entries with a text editor.

When you're happy, overwrite the bad auto-generated catalog and make your hand-crafted one the 'official' one, for example

mv temp4 imagename.cat
Then try to match it:
matchusno2 imagename s
(The .hsel file will be needed by this step). The s on the command line tells matchusno2 to work in single-image mode, that is, to present details of the fit and write a wcs script if the fit is satisfactory.

If you're reasonably sure of the fit, apply the wcs to the image by typing

./wcs_script1

Fixing Bad Matches or Bad Image Scale with handmatchusno

The most stubborn problems occur when the catalog file is good, and the header information is good, but the program somehow gets stuck in an incorrect match. Or perhaps nothing matches because the image scale in the header is wrong, and you don't know what the right value is, so you can't just edit it into the headers. Then you have to figure out the match yourself. The semi-automated program handmatchusno makes this less awkward.

The code will work on its own, but if the .cat and hsel files are available and good, it can make use of them. If the automatic match has failed, these files should exist, so handmatchusno can come along and clean up. But even if they don't exist, it will work with more effort on your part. Start by displaying your image from IRAF, as described earlier. If a *.cat file exists, tvmark the display for reference, and to verify that it's good. If there is no *.cat file, or if it's bad, you may want to make your own as discussed earlier, but it's not absolutely needed.

Invoke the program from an xterminal by typing

handmatchusno imagename [seed-imagename]
where as usual imagename is the image file name (any .fits extension is stripped off internally). The seed-imagename argument is optional. If provided, it will be assumed to be the name of an image taken with the same setup for which a reliable match has already been found (i.e., a good *.radec.match.wcs file already exists for this image). Specifically, the pixel scale, parity, and rotation angle must be identical with the image you're trying to fit, but the seed image can be of a different target altogether. To make use of a seed image, there must be a good catalog of star detections available for the image you're trying to fit (i.e., an imagename.cat file).

The program looks for a *.hsel file, and if one is found, it shows you the contents and asks if they're OK. If you say y things proceed; if n (or if it doesn't find the hsel file in the first place), it prompts you for the needed information. The information will be used here only to figure out which piece of the USNO to grab, so you should be reasonably accurate but need not be super-precise. It then asks you if a usable *.cat file exists for this image; answer y or n.

The program looks up the USNO stars in the area and plots them to a graphics window. You're given a cursor, and further commands are cursor-driven:

If you do write out a match file, a wcs loading script is generated, too. Your beautiful hand-crafted fit can now be applied to the image by going into IRAF or pyraf and typing

cl < wcs_script1
from within IRAF. The reason one needs to do this from within IRAF is that handmatchusno has not yet been updated to take advantage of the capabilities of pyraf and python.

Using an imported star file with handmatchusno. In normal operation handmatchusno looks up the catalog stars from the USNO on disk. You can override this and use your own file of catalog stars by giving a third argument on the command line (the first and second being the filename roots of the program frame and a seed calib frame), which is the name of your own file of catalog stars. The format expected is hh mm ss dd mm ss mag1 mag2; anything can be floating point, and the format is otherwise flexible.

There are several possible sources of imported star files; here are a few.