TrAjectory Based RFI Simulations of radio interferometry data. A source to visibility model for RFI sources including certain near-field effects.
tab-sim is written in JAX and Dask and can therefore use GPUs and/or CPUs to perform larger-than-memory computations.
https://tab-sim.readthedocs.io/
The following instructions are expected to work on a Linux machine. If you are running Windows, it is recommended to use WSL. If you are running macOS, it is best to start with a conda/mamba environment and install python-casacore before continuing with the pip install method. If all else fails, there is the Docker install.
First, clone the repository to your machine with:

    git clone https://github.com/chrisfinlay/tab-sim.git

You can install tab-sim with pip alone, inside an environment of your choice, with optional GPU support:

    pip install -e ./tab-sim/[gpu]

or

    pip install -e ./tab-sim/

If you are having trouble with the pip install method, you can try Docker instead. The provided Dockerfile can be used to build an image which should, in principle, run on any machine.
Assuming you have cloned this git repository into your current working directory, you can either:
- build an image with the latest tab-sim.
- download a working but older version of a tab-sim image.

First, simplify the commands that follow by setting the TAB_DIR environment variable:

    TAB_DIR=$(pwd)

To build an image with the latest tab-sim:

    TAB_IMG=tab-sim:latest
    docker build -t ${TAB_IMG} ./tab-sim/

To download the older image instead:

    TAB_IMG=chrisjfinlay/tab-sim:0.0.1
    docker pull ${TAB_IMG}

After running one of the above, you can run the Docker image using the appropriate command below. The first passes your host user and group IDs to the container with -u; the second runs as the container's default user.

    docker run -it -v ${TAB_DIR}:/data -u $(id -u):$(id -g) ${TAB_IMG} bash

    docker run -it -v ${TAB_DIR}:/data ${TAB_IMG} bash

For more complex tab-sim installs using Docker, you can adapt the Dockerfile to your needs.
tab-sim lets you define a simulation using a YAML configuration file. A general command line interface runs these simulations and allows certain parameters to be changed on the fly as well as in the configuration file. All input data is copied into the output simulation directory so that an identical simulation can be rerun with ease.

Inside tab-sim/examples/target is a set of config files to get you started. There are also example data files which are used for including predefined astronomical and RFI models. They are all CSV files with file extensions that help distinguish them. These reside in tab-sim/data/aux_data.
You will need to provide Space-Track login details as a YAML file. The file can be named spacetrack_login.yaml, for example, and should look like:

    username: user@email.com
    password: password123

To run a simulation of a target field with 100 randomly distributed point sources and some GPS satellites, simply run:

    sim-vis -c sim_target_16A.yaml -st spacetrack_login.yaml

You can run the help function to see what other command line options there are:

    sim-vis -h

The simulation configuration file has many options with set defaults, so a minimal configuration is enough unless more is required.
The base configuration files with defaults and definitions reside in tab-sim/data/config_files.
Once a simulation has been run, a directory is created to store all input and output data used for the simulation. The location of this directory is defined in the simulation config file under:

    output:
      path: output_directory_path
      prefix: simulation_name_prefix

The directory name has the prefix defined in the config file and includes many of the other simulation configuration parameters. For example, when using the target_obs_16A.yaml config file, the name will be pnt_src_obs_16A_450T-0000-0898_1025I_001F-1.227e+09-1.227e+09_100PAST_000GAST_000EAST_3SAT_0GRD_1.0e+00RFI. The resulting directory structure inside this base directory is as follows:

- sim_name/
  - sim_name.zarr/
  - sim_name.ms/
  - AngularSeps.png
  - SourceAltitude.png
  - UV.png
  - log_sim_xxxxxx.txt
  - input_data/
    - MeerKAT.itrf.txt
    - norad_ids.yaml
    - norad_satellite.rfimodel
    - sim_config.yaml

The .zarr and .ms files contain the simulated visibilities, with the .zarr file also containing intermediate values used in calculating certain quantities. The .png files are diagnostic plots to check the angular separations between the RFI sources and the target direction, the source altitude of the target direction, and the UV coverage of the baselines. log_sim_xxxxxx.txt contains the output that was displayed when originally running the simulation. Finally, input_data contains all the data required to rerun the simulation exactly, for reproducibility, especially if the data-hungry visibilities need to be deleted for some reason.
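
As a quick sanity check after a run, the layout described above can be verified with a few lines of Python. This is a minimal sketch and not part of tab-sim: the helper `missing_entries` and the `EXPECTED_ENTRIES` list are hypothetical names, and the list simply mirrors the directory tree given in this README, with `log_sim_*.txt` standing in for the log file whose suffix varies.

```python
import fnmatch
from pathlib import Path

# Entries taken from the directory tree in this README (an assumption that
# every run produces all of them). "{name}" is the simulation directory name.
EXPECTED_ENTRIES = [
    "{name}.zarr",
    "{name}.ms",
    "AngularSeps.png",
    "SourceAltitude.png",
    "UV.png",
    "log_sim_*.txt",
    "input_data",
]


def missing_entries(sim_dir):
    """Return the expected entries that are absent from a simulation directory."""
    sim_dir = Path(sim_dir)
    present = [p.name for p in sim_dir.iterdir()]
    missing = []
    for template in EXPECTED_ENTRIES:
        pattern = template.format(name=sim_dir.name)
        # fnmatch treats "*" as a wildcard, which covers the varying log suffix.
        if not fnmatch.filter(present, pattern):
            missing.append(pattern)
    return missing
```

Calling `missing_entries("path/to/data_dir/sim_name")` returns an empty list when everything listed above is present, which is a cheap check to make before deleting the visibilities and keeping only input_data.
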
The .zarr file is most easily read using Xarray in a Jupyter notebook. The following code reads the simulation data into an xarray.Dataset object.

    import xarray as xr

    zarr_path = "path/to/data_dir/sim_name/sim_name.zarr/"
    xds = xr.open_zarr(zarr_path)
    xds

The structure of the .zarr file is as follows:

Coordinates:

| Name | Description |
|---|---|
| ant | Antenna index |
| bl | Baseline index |
| enu | East, North, Up {m} |
| freq | Frequencies {Hz} |
| geo | Latitude, Longitude, Elevation {deg, deg, m} |
| itrf | International Terrestrial Reference Frame (ECEF) {m} |
| lmn | Local astronomical cosine coordinates |
| radec | Right Ascension, Declination {deg} |
| time | Elapsed observation time centroids {s} |
| time_fine | Fine grain time centroids {s} |
| time_mjd | Modified Julian Date time {days} |
| time_mjd_fine | Fine grain Modified Julian Date time {days} |
| tle | Two-line elements for satellites |
| uvw | Local antenna coordinates {m} |
| xyz | Geocentric Celestial Reference Frame (ECI) {m} |

Data variables:

| Name | Description |
|---|---|
| SEFD | System Equivalent Flux Density {Jy} |
| antenna1 | Antenna 1 index |
| antenna2 | Antenna 2 index |
| ants_itrf | Antenna ITRF coordinates {m} |
| ants_uvw | Antenna UVW coordinates {m} |
| ants_xyz | Antenna XYZ (ECI) coordinates {m} |
| ast_p_I | Astronomical point source intensities {Jy} |
| ast_p_lmn | Astronomical point source positions |
| bl_uvw | Baseline UVW coordinates {m} |
| flags | RFI flags based on 3 sigma from truth |
| gains_ants | Antenna gains |
| noise_data | Visibility noise realisation {Jy} |
| noise_std | Visibility noise standard deviation {Jy} |
| norad_ids | NORAD IDs for TLE-based satellites |
| rfi_tle_sat_A | Modulated satellite signal amplitudes {Jy^0.5} |
| rfi_tle_sat_ang_sep | Satellite angular separation from target direction {deg} |
| rfi_tle_sat_orbit | Satellite TLE orbit parameters |
| rfi_tle_sat_xyz | Satellite XYZ (ECI) coordinates {m} |
| time_idx | Time index to map from time_fine to time |
| vis_ast | Astronomical visibility component {Jy} |
| vis_calibrated | Perfectly calibrated visibilities {Jy} |
| vis_obs | Observed (uncalibrated) visibilities {Jy} |
| vis_rfi | RFI visibility component {Jy} |
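
The `flags` entry is described only as "based on 3 sigma from truth", so the exact criterion used by tab-sim is not stated here. One plausible reading, shown below purely as an illustrative sketch, is that a sample is flagged when the amplitude of the true RFI visibility exceeds three times the noise standard deviation. The function `flag_rfi` is a hypothetical name and the numbers are made up; in practice the inputs would be the `vis_rfi` and `noise_std` variables from the dataset.

```python
def flag_rfi(vis_rfi, noise_std, n_sigma=3.0):
    """Flag samples whose RFI amplitude exceeds n_sigma times the noise level.

    Assumed criterion: |vis_rfi| > n_sigma * noise_std. This is an
    interpretation of "3 sigma from truth", not tab-sim's confirmed rule.
    """
    return [abs(v) > n_sigma * noise_std for v in vis_rfi]


# Made-up complex RFI visibilities {Jy} and noise level {Jy} for illustration.
vis_rfi = [0.1 + 0.2j, 4.0 - 3.0j, 0.0 + 2.9j, -3.5 + 0.0j]
noise_std = 1.0

flags = flag_rfi(vis_rfi, noise_std)
print(flags)  # [False, True, False, True]
```

Comparing flags recomputed this way against the stored `flags` variable is one way to find out whether the assumed criterion matches the simulator's.
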

Attributes:

| Name | Description |
|---|---|
| chan_width | Frequency channel bandwidth {Hz} |
| dish_diameter | Dish diameter {m} |
| int_time | Integration time per sample {s} |
| n_ant | Number of antennas |
| n_ast_e_src | Number of astronomical exponential profile sources |
| n_ast_g_src | Number of astronomical Gaussian profile sources |
| n_ast_p_src | Number of astronomical point sources |
| n_ast_src | Number of astronomical sources (n_ast_e_src + n_ast_g_src + n_ast_p_src) |
| n_bl | Number of baselines |
| n_freq | Number of frequency channels |
| n_int_samples | Number of integration samples per time sample |
| n_sat_src | Number of satellite RFI sources |
| n_stat_src | Number of stationary RFI sources |
| n_time | Number of time steps |
| n_time_fine | Number of fine grained time steps |
| target_dec | Declination of the target direction {deg} |
| target_name | Name of the target |
| target_ra | Right Ascension of the target direction {deg} |
| tel_elevation | Telescope elevation {m} |
| tel_latitude | Telescope latitude {deg} |
| tel_longitude | Telescope longitude {deg} |
| tel_name | Telescope name |
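
Several of the quantities above are related by simple counting. The sketch below illustrates how `n_bl`, `antenna1`/`antenna2`, and `time_idx` could follow from `n_ant`, `n_time`, and `n_int_samples`. It rests on assumptions that this README does not confirm: that baselines are all unique antenna pairs without autocorrelations, that they are ordered with antenna1 < antenna2, and that every time step contains exactly `n_int_samples` fine samples. The function names are hypothetical.

```python
from itertools import combinations


def baseline_pairs(n_ant):
    """Return (antenna1, antenna2) index lists for all cross-correlations.

    Assumes no autocorrelations and antenna1 < antenna2 for every baseline.
    """
    pairs = list(combinations(range(n_ant), 2))
    antenna1 = [p for p, _ in pairs]
    antenna2 = [q for _, q in pairs]
    return antenna1, antenna2


def time_index(n_time, n_int_samples):
    """Map each fine time sample to the coarse time step it belongs to.

    Assumes n_time_fine = n_time * n_int_samples with contiguous samples.
    """
    return [t for t in range(n_time) for _ in range(n_int_samples)]


antenna1, antenna2 = baseline_pairs(4)
print(len(antenna1))     # 6, i.e. 4 * 3 / 2 baselines
print(time_index(2, 3))  # [0, 0, 0, 1, 1, 1]
```

Under the same assumptions, 64 antennas give 64 * 63 / 2 = 2016 baselines. Comparing these derived values with the stored `n_bl` and `time_idx` is a straightforward way to check whether the assumptions hold for a given simulation.
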
Measurement sets allow the addition of non-standard data columns. The simulator in tab-sim takes advantage of this and adds the following columns to help with debugging and analysis.

Standard columns:

- `DATA`: Observed data which includes gains and noise.
- `CORRECTED_DATA`: Filled with zeros or the data of one's choice when calling the `write_ms` function.
- `MODEL_DATA`: Filled with zeros as it will be used by `WSCLEAN` when imaging.

Additional non-standard columns:

- `CAL_DATA`: Observed data (`DATA`) where the true gain solutions have been applied.
- `AST_MODEL_DATA`: The astronomical visibilities only, with perfect gains and no noise.
- `RFI_MODEL_DATA`: The RFI visibilities only, with perfect gains and no noise.
- `AST_DATA`: The same as `AST_MODEL_DATA` but with the noise added.
- `RFI_DATA`: The same as `RFI_MODEL_DATA` but with the noise added.
- `NOISE_DATA`: The complex noise that is added to the above datasets.
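
To make the relationships between these columns concrete, the sketch below works through a single visibility sample on one baseline using plain Python complex numbers. The relations for `AST_DATA` and `RFI_DATA` follow directly from the descriptions above. The form used for `DATA` and `CAL_DATA` is an assumption: it uses the common measurement equation in which the per-baseline gain is the product of one antenna gain and the conjugate of the other, with noise added after the gains. tab-sim may differ in detail, and all numeric values are made up.

```python
# One visibility sample on the baseline between antennas p and q (made-up values).
ast_model = 1.0 + 0.0j    # AST_MODEL_DATA: astronomical signal, perfect gains, no noise
rfi_model = 0.5 - 0.25j   # RFI_MODEL_DATA: RFI signal, perfect gains, no noise
noise = 0.01 + 0.02j      # NOISE_DATA: complex noise realisation
gain_p = 1.1 + 0.1j       # true gain of antenna p
gain_q = 0.9 - 0.2j       # true gain of antenna q

# Stated in the column descriptions: the model columns plus noise.
ast_data = ast_model + noise   # AST_DATA
rfi_data = rfi_model + noise   # RFI_DATA

# Assumed measurement equation: gains multiply the signal, noise is added after.
baseline_gain = gain_p * gain_q.conjugate()
data = baseline_gain * (ast_model + rfi_model) + noise   # DATA

# Applying the true gain solutions to DATA gives CAL_DATA.
cal_data = data / baseline_gain   # CAL_DATA

print(cal_data - (ast_model + rfi_model))  # residual: noise scaled by 1 / baseline_gain
```

Under this assumption the calibrated data equal the sum of the astronomical and RFI models plus the noise divided by the baseline gain, so subtracting `RFI_MODEL_DATA` from `CAL_DATA` would leave the astronomical signal plus noise only.
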

    @ARTICLE{Finlay2023,
        author = {{Finlay}, Chris and {Bassett}, Bruce A. and {Kunz}, Martin and {Oozeer}, Nadeem},
        title = "{Trajectory-based RFI subtraction and calibration for radio interferometry}",
        journal = {Monthly Notices of the Royal Astronomical Society},
        year = 2023,
        month = sep,
        volume = {524},
        number = {3},
        pages = {3231-3251},
        doi = {10.1093/mnras/stad1979},
        archivePrefix = {arXiv},
        eprint = {2301.04188},
    }

    @ARTICLE{Finlay2025,
        author = {{Finlay}, Chris and {Bassett}, Bruce A. and {Kunz}, Martin and {Oozeer}, Nadeem},
        title = "{TABASCAL II: Removing Multi-Satellite Interference from Point-Source Radio Astronomy Observations}",
        journal = {arXiv e-prints},
        year = 2025,
        month = jan,
        doi = {10.48550/arXiv.2502.00106},
        archivePrefix = {arXiv},
        eprint = {2502.00106},
    }