Radio Frequency Interference (RFI) is a major problem in radio astronomy, in which powerful radio transmitters on or near Earth (such as cellphones, microwaves, and satellites) create bursts of noise that overwhelm the astronomical signal. One of the steps in data processing is to detect the parts of the signal that have been corrupted by RFI. Your task is to build a simple RFI detector.
The figure below shows an example, for a range of frequency channels at a single point in time. There is a smoothly varying background shape, which corresponds to the response of the system at various frequencies, but there are also spikes of RFI. The red vertical lines indicate samples that have been flagged as RFI. The values are complex numbers, but only the amplitude is plotted.
There is no perfect way to detect RFI and there are a number of approaches with varying quality of detection. A simple approach you may wish to try is as follows: first, take the amplitude (absolute value) of all the complex numbers to get a real-valued signal. Compute the background, i.e., a smooth version of this signal with the RFI eliminated. Either a median filter or a spline fit is a reasonable choice for this. Subtract the background from the signal to obtain a residual. Any residuals that are particularly large relative to the rest probably indicate RFI. The median of the absolute values of the residuals will give a benchmark against which to decide what is "particularly large".
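The simple approach described above can be sketched as follows. This is a minimal illustration rather than a reference solution: the function names `median_background` and `flag_rfi`, the window length of 11 samples, and the threshold of 5 times the median absolute residual are all assumptions chosen for the example, and you should expect to tune them against the real data.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def median_background(amplitude, window=11):
    """Running median of `amplitude` over an odd-length window.

    The edges are handled by reflecting the signal, so the output has the
    same length as the input. The window length is an assumed default.
    """
    if window % 2 == 0:
        raise ValueError("window must be odd")
    half = window // 2
    padded = np.pad(amplitude, half, mode="reflect")
    return np.median(sliding_window_view(padded, window), axis=-1)


def flag_rfi(data, window=11, threshold=5.0):
    """Return a boolean mask that is True where a sample is flagged as RFI."""
    # Step 1: amplitude of the complex samples gives a real-valued signal.
    amplitude = np.abs(data)
    # Step 2: smooth background that is largely insensitive to narrow spikes.
    background = median_background(amplitude, window)
    # Step 3: residual after removing the background.
    residual = amplitude - background
    # Step 4: the median absolute residual is the benchmark for "large".
    scale = np.median(np.abs(residual))
    # Step 5: flag residuals that exceed the (assumed) multiple of the benchmark.
    return np.abs(residual) > threshold * scale
```

A median filter only suppresses spikes that are narrower than about half the window, so broad RFI may need a longer window or a different background estimate such as the spline fit mentioned above.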
If you have spare time then you can investigate more accurate methods, but make sure you have everything fully implemented, working, and documented before you move on to this.
The data is provided in files named rfi1.h5 through rfi3.h5. These are in the HDF5 file format, which you can read about on the web. The h5py Python module will be useful for reading data from these files. Each file contains a single dataset called Data, which is a 1D array of complex values.
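As a starting point, reading one of these files with h5py might look like the sketch below. The helper name `load_rfi_data` is hypothetical; the dataset name `Data` comes from the description above.

```python
import h5py
import numpy as np


def load_rfi_data(path, dataset="Data"):
    """Read the 1D complex array stored under `dataset` in an HDF5 file."""
    with h5py.File(path, "r") as f:
        # Indexing with [...] copies the whole dataset into a NumPy array,
        # so the result remains valid after the file is closed.
        return f[dataset][...]
```

For example, `data = load_rfi_data("rfi1.h5")` should give a complex NumPy array, and `np.abs(data)` gives the amplitudes.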
Insert your code below this section. You must produce a graph (within the notebook) for each of the input files above, showing the amplitudes of the data and indicating which samples have been flagged. You can use the picture above as a guideline, but you do not need to produce exactly the same style as long as all the relevant information is clearly presented.
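One possible way to lay out such a graph is sketched below. The function name `plot_flags`, the colours, and the axis labels are assumptions for illustration; any style that clearly shows the amplitudes and the flagged samples is acceptable.

```python
import numpy as np
import matplotlib.pyplot as plt


def plot_flags(amplitude, flags, title=""):
    """Plot amplitudes and mark flagged samples with red vertical lines."""
    fig, ax = plt.subplots(figsize=(10, 4))
    samples = np.arange(len(amplitude))
    ax.plot(samples, amplitude, color="C0", linewidth=1, label="amplitude")
    flagged = samples[flags]
    if flagged.size:
        # One vertical line per flagged sample, spanning the data range.
        ax.vlines(flagged, amplitude.min(), amplitude.max(),
                  color="red", alpha=0.4, label="flagged as RFI")
    ax.set_xlabel("Sample index")
    ax.set_ylabel("Amplitude")
    ax.set_title(title)
    ax.legend()
    return fig, ax
```

Here `amplitude` is the real-valued array of absolute values and `flags` is a boolean array of the same length that is True where a sample has been flagged.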
%matplotlib inline
# Some useful packages - feel free to change these or add more
import h5py
import numpy as np
import matplotlib.pyplot as plt
# Add your code. Feel free to use multiple notebook cells.
# You are encouraged to show graphs of intermediate results, such as the background.