April 2003

A better way to search for radio signals from extra-terrestrial intelligences is to look for PRN-encoded signals. Such encoding techniques offer powerful signal-to-noise improvements, making communication far, far more practical than using unencoded radio signals. This search technique will prove practical in the coming decades as Moore's Law provides the CPU cycles needed for such a search. ETI know Moore's Law, and depend on it to become intelligible.

*Caution: This is an informal essay, a draft; conclusions in different parts of this text may contradict themselves. This is 'thinking aloud'/doodling with ideas.*

The PRN technique is used by the GPS (Global Positioning System), some cell radios, and by covert (spy) agencies. Pseudo-Random Noise phase-shift-keying has some very desirable noise-suppression properties that make it a natural candidate for use for communication by ETI. It allows low power transmissions to cut through noise, thus reducing the amount of power that a transmitter needs to be heard. For example, the GPS encoding system allows GPS satellites to transmit at power levels 30 dB below what they might 'otherwise' have to transmit. This is a huge amount of savings: three orders of magnitude. Of course, nothing is free: the penalty one pays is vastly reduced (three orders of magnitude) data rate. The GPS system has a 50-bits-per-second data rate, despite chewing up about 5 MHz of bandwidth. But that can be OK: 50 BPS is adequate for GPS, and I would imagine would be more than adequate for ETI transmissions.

I cannot over-emphasize the noise-suppression characteristics of pseudo-random encoding. The GPS encoding allows the signal level at the receiving antenna to be 30dB below the thermal noise figure at the antenna. That is, if you used a simple amplifier hooked up to a GPS antenna, all you would see, if you looked at the power level, would be pure, unadulterated thermal noise. Yet there's a signal in there, and it's 30dB down. To the uninitiated, this sounds impossible, a contradiction, free energy. But all GPS antennas operate in exactly this way, and the system was designed to work this way on purpose. There is an easy calculation one can perform: compute the thermal noise power for a transistor at room temperature over a 5MHz bandwidth. It's about -105dBm. You can look up the SIS (signal-in-space) figures from the GPS spec, ICD-200C, and see that the signal is about -130 dBm. The fact that such a weak signal is receivable is an offshoot of the Shannon-Hartley theorem, which relates channel capacity to bandwidth in the presence of noise.
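The noise-floor arithmetic above can be checked in a few lines. This is a minimal sketch, assuming a 290 K receiver and taking the roughly -130 dBm signal level quoted above at face value; the function name `thermal_noise_dbm` is made up for illustration. The exact k·T·B figure comes out near -107 dBm, close to the rough value quoted in the text.

```python
import math

BOLTZMANN = 1.380649e-23  # J/K

def thermal_noise_dbm(bandwidth_hz, temperature_k=290.0):
    """Thermal (Johnson) noise power k*T*B, expressed in dBm."""
    watts = BOLTZMANN * temperature_k * bandwidth_hz
    return 10.0 * math.log10(watts / 1e-3)

noise_floor = thermal_noise_dbm(5e6)   # 5 MHz bandwidth, room temperature
gps_signal = -130.0                    # dBm, signal-in-space level quoted in the text
print(f"noise floor: {noise_floor:.1f} dBm")
print(f"signal sits {noise_floor - gps_signal:.1f} dB below the floor")
```

With these inputs the signal lands a little over 20 dB under the floor, the same ballpark as the round figures used in this essay.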

It is in fact this property of being under the thermal noise floor that makes PRN modulation such an appealing thing for spy agencies. If a spy transmits a signal using this technique, and the counter-spy does not know the secret encoding key, the signal cannot be received and decoded. (Well, it can, in theory, but in practice it becomes very, very hard.) The encoding keys can be arbitrarily long and cryptographically secure, preventing the counter-spies from detecting the transmission. Indeed, a variant of this is used by the military to encode the GPS P/Y signal, preventing non-military users from getting the more accurate positions that the system is capable of providing.

It is outside of the scope of this essay to explain how PRN encoding works; it is carefully detailed in textbooks and papers. However, a brief description is in order. One starts by choosing a 'good, random' string of bits to act as the encryption key. In the case of GPS, this key is 1023 bits long. The key has nearly equal numbers of ones and zeros in it, and the distribution of pairs of ones, pairs of zeros, etc. falls off rapidly, as it would in a random (white-noise) spectrum. Each GPS satellite has its own unique key; the keys are published and well known and serve to identify the transmitting satellite. The signal is transmitted by repeating this key over and over. For GPS, each bit is slightly less than a microsecond, so that the pattern repeats every millisecond exactly, for a 1.023 MHz 'carrier' (which is then used to PSK the 1.575GHz radio signal). To modulate the 'carrier', the ones and zeros in the pattern are inverted. For GPS, this is done every 20 repeats (for a 50 BPS signal) and in WAAS, every 2 repeats. To find a signal, one searches for the particular pattern of bits, by convolving the received radio signal with the known key of a given GPS satellite. The integration represented by the convolution will have the effect of smoothing out all noise that is not an exact match for the transmitted pattern. The fact that the integration takes place over 1023 bits is what gives the 30 dB process gain (although even more gain can be gotten by integrating over the 20 millisecond bit period; but this is descending into the details). Note, however, how this technique 'wastes bandwidth': a 1.023MHz carrier is used to encode a 50 (500 for WAAS) bit-per-second signal. If this were plain-old AM or FM radio, one would need only 50 Hz to transmit a 50BPS signal. Three orders of magnitude of bandwidth use are traded for three orders of magnitude of signal-to-noise suppression.
It is the use of a 'pseudo-random' key that provides the conversion of bandwidth to SNR process gain.
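The spreading and despreading described above can be imitated with a toy simulation. This is only a sketch under stated assumptions: the chips are plain random +/-1 values rather than a real GPS code, the receiver is already perfectly aligned with the transmitter, and all function names are invented for the example. Each chip arrives about 14 dB under the noise, yet summing over 1023 chips recovers every bit, while a mismatched code averages out to nearly nothing.

```python
import math
import random

def make_code(length, rng):
    """Random +/-1 chips standing in for a real PRN code (an assumption)."""
    return [rng.choice((-1, 1)) for _ in range(length)]

def transmit(bits, code, noise_sigma, rng):
    """Spread each data bit (+/-1) over the whole code, then add Gaussian noise."""
    return [bit * chip + rng.gauss(0.0, noise_sigma)
            for bit in bits for chip in code]

def despread(samples, code):
    """Correlate each code-length block against the code; the sign is the bit."""
    n = len(code)
    sums = [sum(s * c for s, c in zip(samples[i:i + n], code))
            for i in range(0, len(samples), n)]
    return [s / n for s in sums]

def process_gain_db(code_length):
    """Process gain of a length-N code, 10 log10(N)."""
    return 10.0 * math.log10(code_length)

rng = random.Random(2003)
code = make_code(1023, rng)
wrong_code = make_code(1023, rng)
bits = [rng.choice((-1, 1)) for _ in range(8)]
samples = transmit(bits, code, noise_sigma=5.0, rng=rng)  # chips ~14 dB under the noise

recovered = [1 if v > 0 else -1 for v in despread(samples, code)]
mismatch = despread(samples, wrong_code)
print(recovered == bits, f"{process_gain_db(1023):.1f} dB")
print("largest wrong-code output:", max(abs(v) for v in mismatch))
```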

ETI are well aware of the generic theory of PRN-based PSK encoding schemes. No doubt, they know of some we do not. It is important to stress that PRN encoding is a very 'intellectually natural' way to encode a signal, much as AM radio seems 'obvious' and 'natural' to the beginning radio student. There is nothing about PRN schemes that smells of 'human nature', something bizarre or freaky that only humans want, need or would invent. The cryptographic theory behind PRN is a natural subject in mathematics, and finds pervasive application in computing. It would be as natural to ETI as it is for us.

Of course, if an ETI was broadcasting a signal that it wanted to be heard, it would be pointless to use an excessively long encoding key, since doing so would make it very hard for us to detect the signal. But this is exactly where things get interesting. Short keys are much easier to detect, but offer less noise-suppression. In fact, the process gain is just the logarithm of the length of the repeating pseudo-random noise sequence. GPS uses a 1023-bit long PRN sequence, which is what provides the 30dB of process gain. But long keys are hard to detect. That's because the detection algorithm requires a convolution of the bit-sequence with the radio signal. The convolution will yield zero (or rather, a Gaussian noise distribution) unless the bit-sequence is within about half-a-bit of alignment with the signal, at which point the convolution strength jumps dramatically. Thus, to pick out a 1023-bit pattern, one must go through three steps:

- Know what pattern to use, to look for. There is a fairly limited number of 1023-bit patterns that have the desirable noise properties; GPS uses 37 of these, plus another 20 for WAAS; some more are reserved for EGNOS. To find a GPS satellite, you try each pattern in turn.
- One must try every correlation possibility: one must compute the convolution for each of 1023 shifts, and preferably two times that, to pick something out.
- Because the GPS satellites are moving, one must try a variety of different Doppler shifts.
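The three steps above amount to a brute-force search over codes, code shifts and Doppler bins. The sketch below, with invented names and a random 511-chip stand-in code, performs only the code-shift part of that search and merely counts the remaining dimensions; Doppler is not simulated, and the number of Doppler bins is a free parameter.

```python
import random

def circular_correlate(samples, code, shift):
    """Correlation of the samples against the code delayed by `shift` chips."""
    n = len(code)
    return sum(samples[i] * code[(i - shift) % n] for i in range(n))

def acquire(samples, code):
    """Brute-force code-phase search: return (best_shift, peak_value)."""
    scores = [circular_correlate(samples, code, s) for s in range(len(code))]
    best = max(range(len(code)), key=lambda s: scores[s])
    return best, scores[best]

def search_cells(n_codes, code_length, n_doppler_bins):
    """Cells to test: every code, every half-chip shift, every Doppler bin."""
    return n_codes * (2 * code_length) * n_doppler_bins

rng = random.Random(42)
code = [rng.choice((-1, 1)) for _ in range(511)]
true_shift = 137
# One period of the code, delayed by true_shift chips and buried in noise.
received = [code[(i - true_shift) % 511] + rng.gauss(0.0, 2.0) for i in range(511)]
found_shift, peak = acquire(received, code)
print(found_shift, search_cells(37, 1023, 1))
```

The cost grows as the product of all three dimensions, which is why longer codes are so much harder to find than to track.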

But all this is doable, especially if you have the world's largest supercomputer. Put yourself in the shoes of an advanced civilization that would be capable of mounting a project to transmit a strong signal. Presumably, they would be technologically advanced by many thousands of years over our own. Presumably, they might still know Moore's law to be true (or not). If we imagine the computing resources that humanity may have in 30 or 100 years, searching for a PRN needle in a thermal-noise haystack might be entirely reasonable. We know this, and, importantly, the ETI know this too. They would use this technology to provide process gain of many orders of magnitude to cut through the clutter of natural phenomena, and to cut through terrestrial interference. They know that we possess (or will soon possess) the computing power needed to find the signal. They know that it is far, far cheaper and easier to gain orders of magnitude of computing power than to build radio dishes that are orders of magnitude larger (or to build radio transmitters that are orders of magnitude more powerful). ETI may not need to build a terawatt transmitter; far less may suffice. They know this, and we know this, and thus it is natural that they would solve the problem through computing brute-force rather than through amplifier brute force. They know that computing brute-force will be a lot cheaper for us than radio-reception brute force is: they'll choose the economically viable solution.

I should also point out that once one has detected a signal, once one has locked onto a signal, once one knows the key, tracking it is easy. Once one knows the key, the carrier frequency, and the phase offset, tracking is nearly trivial. The only hard part is to try all of the different possibilities and combinations. The only hard part is the search: the actual reception, once the signal has been found, is easy. This is really the key concept being touted here: The search of PRN signals is really hard, because of the combinatorial explosion. However, the use of PRN makes possible extremely long integration periods and thus a tremendous amount of noise suppression. This, in turn, makes it possible for the transmitting civilization to use relatively low power and even (gasp!) an omni-directional antenna! It overcomes the question of whether the transmitting antenna is aimed at us, of whether we were looking while they were aiming at us.

I should point out that not all astronomical noise is 'white noise'. As the SETI@Home project ably shows, there are vast amounts of pseudo-CW noise. In the SETI@Home terminology, a possible detection of a narrow-band continuous wave signal is a 'Gaussian'. They have found many, many transient Gaussians. I call them transient, because when one looks again at the same patch of sky, one almost always doesn't see them again. The origin of these 'Gaussians' is unknown. If they have a purely statistical distribution, then they are just the normal tail-end of statistically distributed noise. If the distribution of these 'false positives' is beyond the expected random distribution, then one has to look for a serious explanation. Besides ETI, they may be from astronomical sources (novas, neutron stars, etc.) or they may be of a particularly subtle form of terrestrial interference. From this experience, I think it might be appropriate to conclude that the cosmos is littered with naturally occurring, narrow-band, CW 'noise', giving searchers such as SETI@Home lots of false-positive hits. By contrast, a PRN-modulated signal would cut through this clutter; it would almost certainly not be naturally occurring. While one might be able to imagine a natural process that radiates narrow-band CW, it is much harder to imagine one that radiates PRN. Such a natural source would have to be governed by some sort of chaotic attractor that happens to have the signature of a fairly complex generating boolean polynomial. It seems unlikely. So this is the point: PRN can cut through not only white-noise clutter, but also through CW clutter. With the exception of GPS and spy-agency use, it should also cut through virtually all man-made interference.

Don't be misled by the above paragraph. The search for PRN signals will also have a combinatorially large explosion of 'false-positives'. Because white noise is uncorrelated in the time domain, by exploring a combinatorially large space of possible signals, one will find a huge number of combinations where the white noise just happened to add instead of cancelling. The search for PRN is very hard. Its 'good' properties emerge only after the signal is found.

Finally, one should note that ETI use of PRN sequences easily explains the negative results seen so far by other SETI programs (e.g. the Harvard BETA Project). If a carrier were PSK modulated at a chipping rate of a few megahertz (for example), then the 1/2 second integration time used by the BETA project spectrometers would completely wipe out the signal. To detect a PRN signal, one needs not only to perform quadrature over long periods of time, but one needs to perform it while superimposing the same exact PRN sequence as the transmitter. Any other sequence, or a mismatch of the sequence, just wipes out the signal.

Thus, based on these arguments, I believe that it is of utmost importance to begin attempts to search for these types of signals. The search could be considerably more complex, and take much much longer than searches made so far. But in a different sense, it seems to have a much, much higher probability of success. It is far more natural. I would love to participate in such an effort.

- SETI@Home is a project run by professional astronomers to search for ETI signals. It uses software running on millions of home computers to search for continuous-wave signals possibly received by the Arecibo dish.
- The SETI Institute is primarily an academic institution engaged in the search for and study of extra-terrestrial life.
- The SETI League is an organization of primarily Ham Radio operators using traditional ham radio techniques to search for ETI signals.
- Project Argus: What We've Heard So Far provides a photo-gallery of spectrum analyzer screen-shots receiving various signals.
- Harvard SETI Projects search for narrow-band continuous wave signals by using massively parallel spectrum analyzers.

- The GPS Signal-in-Space strength is below the thermal noise threshold of a GPS antenna only if one is using an omni-directional antenna, as is needed to receive the full constellation of satellites. If instead, one uses a dish aimed at a satellite, together with a narrow-band receiver, then the gain of using a dish will pull the raw modulated energy out of the noise, and make it quite visible in FFT graphs/spectrum analyzers. Remember: thermal noise in a narrow-band receiver is much, much less than in a wideband receiver. To actually use GPS, one must use a wideband receiver.
- It is possible to detect very very weak CW signals by integrating over very very long periods of time. The longer one integrates, the more noise suppression one achieves. This is the 'basic science' that allows weak signals to be detected, and underlies the SETI@Home search. Because integrating a PRN signal is a lot like integrating a CW signal, one might naively think that the use of PRN modulation provides absolutely no advantage over a pure CW signal. Naively, PRN seems to only make detection harder, not easier. Naively, one would be right, *if* one could assume that all noise sources were pure, non-correlated white noise. But noise sources are not pure white noise. There is lots of man-made clutter that is very non-random, non-white-noise. A lot of stellar noise is likely to be shot noise, which has a non-trivial autocorrelation. The use of PRN provides an additional filtering that allows this clutter to be eliminated, to eliminate all the false-positives that a narrow-band search is likely to lock onto.
- The actual "best" communications encoding one could use for intra-galactic communications should be chosen in the light of a good model of galactic and local noise. In statistics, 'Bertrand's Paradox' illustrates that one cannot make meaningful statistical statements without having an underlying model of the physical processes. This applies to galactic noise as well as lines and circles. By having a good model, one can then devise a good code that can effectively suppress that noise. It is reasonable to assume that ETI has such a model, and that they are using an appropriate code. It is therefore appropriate to try to come up with the same model, and then divine the communications scheme. It is the conceit of this article that PRN-based phase- or frequency-shift keying is that scheme, although this is a leap.

However, this 200dB gap can be bridged: if the transmitter were more powerful, if the transmitting antenna were more focused, if the transmitter were closer, and if the receiving antenna was larger and used modern space-communications electronics, then the gap between thermal noise in the receiving antenna and the received signal could drop to some 60 dB, which could be easily achievable with PRN sequences in the million-chip range, and a communications data rate of about 1 bit per second. To find which million-chip sequence was being used, and at what carrier frequency, is conceivable using today's technology. In fact, I believe that one could use technology that is available today, purchased with entirely reasonable civilian, non-governmental budgets, and communicate effectively with the nearest stars, were there anybody there to communicate with. To back this claim, more accurate estimates of the various gains & losses are needed.

The above discussion begs the question of why anyone would transmit such a signal towards earth. There are two important classes of transmitters: those who intentionally transmit towards us, and those whose signal accidentally just happens to splash onto the earth. The intentional transmitters are presumably those who have made a sky survey, identified the nearest, most-likely habitable solar systems, and are beaming messages in such a fashion as to make it easy for us to detect them. The second class would be those who are using this technique to communicate with someone else, and the earth just happens to be grazed by their signal. We should presume that such a signal would be intentionally difficult for us to receive and decode, and would be encrypted. An interesting tell-tale would be, though, another signal coming from the opposite direction of the sky!

Needed Estimates:

- Thermal noise is proportional to receiver bandwidth, but the receiver bandwidth must equal or be better than the transmitter chipping rate. The chipping rate is in turn determined by the desired communications data rate, and the desired PRN length. The PRN length in turn determines how much thermal noise can be overcome. We need a graph of effective gain over thermal noise as a function of the communications rate and the PRN sequence length. (Answer: it's flat, explain why).
- For a given PRN length, how many different sequences are there that would have good communications properties (i.e. have good noise distributions)? Of these, how many could be generated with short boolean polynomial generators? These numbers determine how hard the search will be: We need to try the various different polynomials, and we need to try each different chip phase. In other words, we need a graph of search difficulty as a function of PRN sequence length, given that the actual PRN sequence is unknown. (Note: the only 'interesting' PRN lengths are those with N being a large prime, as these are the ones that spread out the spectrum the most. By contrast, if N is a product of small primes, then many/most spectral lines are absent.)
- What is the range of frequencies we want to search? The obvious range is the 'water hole', 1.4GHz to 1.7 GHz. We need to try to fit each PRN pattern at each frequency band. The size of a frequency band is given by the chipping rate. Since we don't know the chipping rate, we would need to try many of these, thus searching many overlapping bands. What is a reasonable search algorithm through these overlapping bands? (A 'reasonable' algorithm would detect at least some energy even if it was not dead-on in chip rate or carrier frequency). We need a graph showing search complexity as a function of chipping rate.
- Distant sources are likely to be receding with considerable (apparent) accelerations, and thus it's likely that the received signal will be 'chirped'. Find the max astronomical chirp rate. Discuss the search strategy in face of chirp rate. Note that GPS avoids the chirp-rate question by using fairly short integration periods, and I & Q signals to feed a PLL which can lock onto and track changing carrier frequencies. Define the longest possible integration time before chirp rate becomes significant.
- The number of false-positives increases in direct proportion to the number of searches made. This is because of the uncorrelated nature of white noise. If one integrates for a long time at one frequency, on average the noise will cancel out. But if one integrates a long time with a million different possible phase combinations at that frequency, one will have almost a million cases where the noise cancelled out on average, and a handful where, statistically, the noise failed to cancel out. These are the false positives. We need a formula that expresses the statistical distribution of the false positives. This is just the formula for false positives for integrating a noisy CW signal, times the combinatorial explosion of search possibilities. In short, the use of PRN modulation does not make searching easier; the number of false positives explodes with the same combinatorial vengeance as the search space.
- We need a graph that puts the above together: the search complexity as a function of desired chipping rate and communications rate (minimizing noise/maximizing process gain).
- Next, we need to work backwards from the above data: assuming a given search complexity, how far away, and how powerful, would the transmitter need to be, for us to be able to receive it & find it?
- Practical considerations. Antennas, pre-amplifiers and down-converters all affect operation and provide gain in various ways. These need to be characterized ... Gain from use of modern-day, typical 2-meter or 10-meter dish antennas. Noise figure of preamp. Gain from down-conversion .. what is it -- 30 dB for 1.5 GHz to 150 MHz down-conversion. etc. etc.
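The false-positive estimate asked for in the list above can be roughed out if one assumes, as a simplification, that each search cell yields an independent, unit-variance Gaussian statistic when no signal is present. Under that assumption the expected number of false positives is the number of trials times the one-sided Gaussian tail probability. The function names are hypothetical.

```python
import math

def tail_probability(threshold_sigma):
    """P(one Gaussian noise sample exceeds the threshold), one-sided."""
    return 0.5 * math.erfc(threshold_sigma / math.sqrt(2.0))

def expected_false_positives(n_trials, threshold_sigma):
    """Single-trial tail probability times the size of the search space."""
    return n_trials * tail_probability(threshold_sigma)

def threshold_for(n_trials, allowed_false_positives=1.0):
    """Bisect for the threshold (in sigma) giving the allowed false-positive count."""
    lo, hi = 0.0, 40.0
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if expected_false_positives(n_trials, mid) > allowed_false_positives:
            lo = mid
        else:
            hi = mid
    return hi

for trials in (1e6, 1e9, 1e12):
    print(f"{trials:.0e} trials -> threshold {threshold_for(trials):.2f} sigma")
```

The required threshold climbs only slowly with the number of trials, but every extra sigma must be paid for with more signal energy or longer integration.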

Note that a modified fast-Fourier technique can be used to quickly search multiple PSK combinations. This insight implies that searching for an FSK (frequency shift-keyed) signal is roughly as hard as searching for PSK, since the FFT provides the energy bins which must be summed in every possible combination to find the FSK pattern. Note that the large number of combinations makes this seem like an NP problem, but I think a simple hill-climbing search can be pretty good. First one looks for a 'hill' on a fairly short integration period. One then needs only to compare the 'hill' to the location of the next repeat of the PRN pattern. If there's no hill there, then nothing, and no signal was detected. If there is one, then one needs to examine a few nearest neighbors as well. This is no longer an NP search, this is a polynomial-time search. Right? Or did I confuse myself?

The 300MHz bandwidth is the approximate width of the water window. Arguments developed in later sections indicate that a PRN signal is likely to occupy the entire width of the window, thus motivating the use of this number for the noise calculations.

- Background sky brightness in the water window is approx 1E-20 Watts / (meter^{2} steradian Hz) and consists of roughly equal galactic synchrotron & cosmic 2.7K background. The synchrotron power varies by a factor of 100 from in-plane to pole views.
- Sky Radio Brightness at 20, 40 and 80 GHz
- Formula for blackbody radiation, for power impinging on a unit area, solid angle, frequency is P df dOmega dA = 2h (f^{3}/c^{2}) df dOmega dA / (e^{hf/kT} - 1). For T=2.7K, f=1.5GHz we get hf/kT = 2.7e-2 and P = 1.8E-21 Watts/(Hz sr m^{2})
- Overview of Blackbody Radiation, including simple derivation.
- Black Body Radiation, (mirror) a detailed derivation from quantum principles.
- The total amount of Johnson noise in the first antenna pre-amp transistor is given by Boltzmann's constant, which is k = 1.38 x 1e-23 Joules/Kelvin. At a room temperature of 290K, this corresponds to a thermal noise power of P = -173.98 dBm + 10 log_{10}(frequency in Hz), where dBm refers to the decibel power level referenced to milliWatts. For an antenna held at liquid-nitrogen temperatures, the noise can be held down to -179 dBm + ...
- 2.3 meter dish having gain of 38 dBi, beam width of 7.0 Degrees. (At what frequency? not 1.5 GHz!)
- Agassiz Station 26 meter dish, effective aperture of 239 meters squared. (What is the dBi for this antenna?)
- Assume a 300MHz bandwidth and a room-temperature receiver. The Johnson (thermal) noise power at the input to the first antenna preamp is then -174 + 85 dBm = -89 dBm. This compares roughly to the background sky brightness for a 2.3 meter dish, which is about 4m^{2} . 1E-2 ster . 300MHz . 1E-20 Watts/(m^{2} sr Hz) = 1E-13 Watts = -100 dBm. All dishes will see about the background sky brightness, because the larger collecting area of a large dish is offset by its narrower beam. Both sources of noise vary linearly with the bandwidth. For small, cheap antennas, we can approximate the total noise by the thermal noise. For slightly better antennas (e.g. cooled), the dominant background becomes the CMB/Galactic background. The point being that both are roughly comparable in magnitude; and ETI would use the CMB/Galactic background as the noise floor for receiver calculations. Hmmmm ...
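The figures in the list above can be re-derived with a short script. It is a sketch only: it uses the blackbody formula and the 4 m^2, 1E-2 steradian, 300 MHz and 1E-20 values exactly as given in the text, with CODATA constants, and the helper names are invented for the example.

```python
import math

H = 6.62607015e-34      # Planck constant, J s
K = 1.380649e-23        # Boltzmann constant, J/K
C = 2.99792458e8        # speed of light, m/s

def planck_brightness(freq_hz, temp_k):
    """Blackbody brightness 2h f^3/c^2 / (e^{hf/kT} - 1), in W/(m^2 sr Hz)."""
    x = H * freq_hz / (K * temp_k)
    return 2.0 * H * freq_hz**3 / C**2 / math.expm1(x)

def to_dbm(watts):
    """Convert a power in watts to dBm."""
    return 10.0 * math.log10(watts / 1e-3)

bandwidth = 300e6                                   # Hz, width of the water window
cmb = planck_brightness(1.5e9, 2.7)
johnson = to_dbm(K * 290.0 * bandwidth)             # room-temperature preamp
sky = to_dbm(4.0 * 1e-2 * bandwidth * 1e-20)        # 4 m^2 dish, 1E-2 sr beam, 1E-20 sky
print(f"CMB brightness {cmb:.2e} W/(m^2 sr Hz)")
print(f"Johnson {johnson:.1f} dBm, sky {sky:.1f} dBm")
```

The two noise sources land roughly 10 dB apart, which is the sense in which the text calls them comparable.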

- N = PRN Sequence Length
- f = Chip Rate (number of PRN chips transmitted per second)
- W = Carrier frequency (1.4 to 1.7 GHz)
- R = Data transmission bitrate = f/N
- B_{chip} = Bandwidth of the PRN signal = 4f (approximately)
- A_{r} = Receiver aperture
- t = Integration time
- T_{noise} = Antenna temperature

- R = f/N. We assume that the PRN sequence will be modulated by the data at this rate, a rather natural rate for PRN modulation.
- B = 4f (approximately). PRN PSK modulation splatters energy over a broad frequency band, this range encompasses most of the energy.
- W_{delta} = search step size = (1/4) f/N = R/4. This is the optimal step size between searched frequencies. This assumes that the integration period is N/f, which is the length of one data bit. Over the course of the integration, the pseudo-phase must drift by less than 90 degrees (one-fourth of a full cycle), in order for the signal to not be wiped out. Ergo, the one-fourth in the step size.
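To get a feel for the relations R = f/N and W_delta = R/4 above, the following sketch counts how many carrier frequencies would have to be tried across a span of spectrum. The example numbers (a 1 Mchip/s rate, a million-chip code, a 300 MHz span) are hypothetical, chosen only to match the orders of magnitude discussed in this essay, and the function names are invented.

```python
def data_rate(chip_rate_hz, code_length):
    """R = f / N: one data bit per repetition of the PRN sequence."""
    return chip_rate_hz / code_length

def frequency_step(chip_rate_hz, code_length):
    """W_delta = R / 4: keeps phase drift under a quarter cycle per bit."""
    return data_rate(chip_rate_hz, code_length) / 4.0

def frequency_steps(span_hz, chip_rate_hz, code_length):
    """Number of carrier frequencies to try across a span of the spectrum."""
    return span_hz / frequency_step(chip_rate_hz, code_length)

# Hypothetical example: 1 Mchip/s with a million-chip code, across a 300 MHz span.
f, n = 1.0e6, 1_000_000
print(data_rate(f, n), frequency_step(f, n), frequency_steps(300e6, f, n))
```

This count multiplies with the code-phase and code-choice dimensions of the search, so it is one factor in the total effort rather than the whole of it.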

C = B log_{2}(1 + S/N)

where C is the channel capacity, B the bandwidth, and S/N the signal-to-noise power ratio. The actual achievable data rate R must be less than C; to approach C while maintaining a low BER requires arbitrarily 'complex' encoding schemes. Rearranging this formula, we can see that using a very low data rate spread over a very large bandwidth can provide us with the ability to work with very low signal-to-noise ratios. In particular, for a 0.1Hz data rate spread over 400MHz of bandwidth, we can realize 90 dB of process gain; i.e. work with signal-to-noise ratios of 1.0e-9 or -90 dB.
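The rearrangement mentioned above can be made explicit: setting R = C in the Shannon-Hartley formula and solving for the signal-to-noise ratio gives S/N = 2^{R/B} - 1. The sketch below evaluates this for the 0.1 Hz over 400 MHz example; the function names are invented. The exact value comes out somewhat deeper than the round 90 dB figure, but in the same ballpark.

```python
import math

def min_snr(data_rate_hz, bandwidth_hz):
    """Smallest S/N for which C = B log2(1 + S/N) still reaches the data rate."""
    # expm1 keeps precision when R/B is tiny.
    return math.expm1(math.log(2.0) * data_rate_hz / bandwidth_hz)

def db(ratio):
    """Express a power ratio in decibels."""
    return 10.0 * math.log10(ratio)

snr = min_snr(0.1, 400e6)
print(f"minimum S/N = {snr:.2e} ({db(snr):.1f} dB)")
```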

The theorem also makes an important statement about minimum energy
levels. The noise power N = BN_{0} is a product of the
receiver bandwidth times the noise density. The signal power
S = R E_{b} is a product of the actual bit rate times
the energy per bit. Substituting for S and N, and using the
inequality R < C, we get an expression for the minimum energy
per transmitted data bit:

(B/R) (2^{R/B} - 1) < E_{b}/N_{0}

For R << B that we are considering, the above can be expanded
into

ln 2 (1+ (R/2B) ln 2) N_{0} < E_{b}

which defines the minimum energy per transmitted data bit.
If the energy per bit is less than this, there is no way,
using any encoding scheme, to receive the data. This limit
is called the *Shannon Limit*.
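As a numerical check on the bound and its expansion, the sketch below evaluates both the exact expression (B/R)(2^{R/B} - 1) and the first-order form ln 2 (1 + (R/2B) ln 2) for a few values of R/B. The function names are made up for illustration. As R/B shrinks, both converge on ln 2, about -1.59 dB.

```python
import math

def min_eb_over_n0(rate_over_bandwidth):
    """Exact bound (B/R)(2^{R/B} - 1) on E_b/N_0."""
    x = rate_over_bandwidth
    return math.expm1(x * math.log(2.0)) / x

def min_eb_over_n0_series(rate_over_bandwidth):
    """First-order expansion ln2 (1 + (R/2B) ln2), valid for R << B."""
    ln2 = math.log(2.0)
    return ln2 * (1.0 + 0.5 * rate_over_bandwidth * ln2)

for x in (1.0, 1e-2, 1e-6):
    exact = min_eb_over_n0(x)
    print(f"R/B={x:g}: exact {exact:.6f}, series {min_eb_over_n0_series(x):.6f}, "
          f"{10 * math.log10(exact):.2f} dB")
```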

If you don't have a background in signalling theory, but know integrals and sine waves and noise, then you can understand the theorem by thinking about a sine wave added to white noise. To pull out the sine-wave signal, one must integrate/convolve with a sine wave. Think about how long one needs to integrate this combination before the integral of the sine-wave part equals/exceeds the integral of the noise part. Note that the inverse of this time period is the channel capacity.

For a room-temperature feedhorn,
ln2 N_{0} = ln2 kT = 2.8E-21 Joules is the minimum
bit energy.
By comparison, the signal received from a one-megawatt transmitter
driving an omni-directional antenna 100 light-years distant
from us will be about 9E-32 Joules/(second
meter^{2}). To bridge this nearly eleven orders of
magnitude, we need to hope that the ETI use a signalling rate
of 0.01 Hz (for two orders of magnitude),
we need a cryogenic feedhorn (for another one-n-a-half orders
of magnitude), the hope that ETI is using a 10 megawatt
transmitter (instead of one) and a directional antenna,
and/or they're closer, and finally, a collecting area in
the tens-of-thousands square meter range. Ugh.
Only then can we expect the energy-per-bit to exceed the
Shannon Limit imposed by thermal Johnson noise in the preamp.
These limits and theoretical considerations apply equally
to a simple, Morse-code-like CW modulation that traditional
SETI searches look for, as it would to a PRN encoding scheme.
PRN modulation does **not** provide a mechanism to
get around the Shannon Limit.
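The energy bookkeeping in this paragraph can be reproduced with a small link-budget sketch. It assumes an isotropic transmitter and a receiver limited only by preamp Johnson noise; the 10 K receiver temperature in the last line is a hypothetical stand-in for the cryogenic feedhorn, and the function names are invented.

```python
import math

K = 1.380649e-23            # Boltzmann constant, J/K
LIGHT_YEAR = 9.4607e15      # metres

def flux(power_w, distance_ly):
    """Power per square metre from an isotropic transmitter, W/m^2."""
    r = distance_ly * LIGHT_YEAR
    return power_w / (4.0 * math.pi * r * r)

def min_bit_energy(temp_k):
    """Shannon limit ln2 * k * T on the energy per data bit, in joules."""
    return math.log(2.0) * K * temp_k

def required_area(power_w, distance_ly, bit_rate_hz, temp_k):
    """Collecting area (m^2) at which the energy per bit just reaches the limit."""
    return min_bit_energy(temp_k) * bit_rate_hz / flux(power_w, distance_ly)

gap = math.log10(min_bit_energy(290.0) / flux(1e6, 100.0))
print(f"gap for 1 m^2, 1 bit/s: {gap:.1f} orders of magnitude")
# Hypothetical scenario: 10 MW omni transmitter, 0.01 bit/s, 10 K receiver.
print(f"area needed: {required_area(1e7, 100.0, 0.01, 10.0):.2e} m^2")
```

With a purely omni-directional transmitter at 100 light-years, the area in this scenario comes out on the order of a million square metres, which is why the text also leans on a directional transmitting antenna or a closer source.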

A large collecting area can beef up the collected power to drown out the Johnson noise. However, roughly 10dB later, the Cosmic Microwave Background (CMB) kicks in, and simply increasing dish size does not change the amount of CMB received. The amount of extra CMB power received by the larger dish is exactly negated by the reduced CMB power due to the sharper focus of the dish. The dish area and the diffraction-limited solid angle vary in inverse proportion, and cancel out in the black-body noise formula. The total noise power is independent of the dish size. The Shannon Limit applies to all sources of noise in a communications channel: it applies to the CMB as well. The best way to minimize CMB in the communications channel is to make the solid angle as small as possible for a given collecting area size, using synthetic aperture/phased array interferometer techniques. A good way to understand this is to imagine building a traditional imaging radio astronomy telescope. Each pixel picks up a chunk of the CMB noise for that part of the sky. Making each pixel smaller (increasing the angular resolution) decreases the amount of sky viewed and thus decreases the CMB received in that pixel. Increasing the collecting area of the telescope makes each pixel brighter, increasing the amount of both signal (if any) and noise collected. If we now imagine that only one of the pixels contains a SETI source, and none of the others do, then we can throw away the other pixels (and their noise contribution), and focus on the one with a signal. (Of course we have to look at each pixel, first).
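The cancellation between collecting area and beam solid angle can be illustrated numerically. This sketch assumes an ideal, diffraction-limited dish obeying the antenna relation A * Omega = lambda^2, and uses the 1E-20 Watts/(m^2 sr Hz) sky brightness figure from this essay; the function names and the three dish sizes are chosen only for the example.

```python
import math

C = 2.99792458e8   # speed of light, m/s

def beam_solid_angle(diameter_m, freq_hz):
    """Diffraction-limited beam, using the antenna relation A * Omega = lambda^2."""
    wavelength = C / freq_hz
    area = math.pi * (diameter_m / 2.0) ** 2
    return wavelength ** 2 / area

def sky_noise_per_hz(diameter_m, freq_hz, brightness):
    """Background power per Hz collected: brightness * area * solid angle."""
    area = math.pi * (diameter_m / 2.0) ** 2
    return brightness * area * beam_solid_angle(diameter_m, freq_hz)

powers = [sky_noise_per_hz(d, 1.5e9, 1e-20) for d in (2.3, 26.0, 300.0)]
for d, p in zip((2.3, 26.0, 300.0), powers):
    print(f"{d:6.1f} m dish: {p:.2e} W/Hz")
```

Every dish size gives the same background power per Hz, about 4E-22 W/Hz at 1.5 GHz, because the product of area and solid angle is fixed by the wavelength alone.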

In an earlier section, we noted that CMB noise at this frequency (water window, 1.5 GHz) in a diffraction-limited dish is about 4E-22 Watts/Hz = 4E-22 Joules, independent of the dish size. The large dish needed to collect enough bit-energy to overcome Johnson noise should also be enough to overcome the CMB, more or less. Since one cannot build a huge dish (except at Arecibo), one must bolt together a bunch of small dishes. To keep the CMB noise from becoming additive, we need to hook up the dishes using a synthetic aperture technique so as to keep the beamwidth down. To provide a more accurate estimate, we need to look at how noise propagates through synthetic aperture electronics, including the down-conversion (digital and/or analog), Nyquist aliasing of the noise in the digitizer, and the fact that this needs to be broad-band to make the PRN detection work ... This requires some hefty work.

To recap: The Shannon-Hartley theorem shows that PRN trickery can serve to pluck a weak signal out of heavy noise. However, it only works up to a point; to go beyond that point, one needs to use antenna trickery to increase the amount of collected energy per bit, while using very narrow beam-widths to minimize noise coming from CMB and galactic sources.

ToDo: Determine if synchrotron radiation is important, and if so, how it affects the signalling theory (since it's not (?) white noise). In particular, what is the autocorrelation function for galactic noise in the water window?

However, there may be some interesting games that individuals/amateurs can play, using high-bandwidth Internet connections to build very large synthetic aperture scopes. The basic idea is that 50 kHz of bandwidth can be digitized into an approximately 400 kbit/second digital stream, which can be piped across DSL/Cable Internet connections. These streams can then be combined in more-or-less realtime to provide high-resolution images. Accurate timestamps are needed so that the raw signals can be time-correlated to create the synthetic aperture. The GPS signal can provide a (relatively) cheap microsecond-accurate timestamp. I don't know if this timestamp is enough, or if calibrated atomic clocks are needed. Hmm.... Can one even do interferometry after down-conversion, or must it be done before? I would think so, but maybe one loses too much data after narrow-band filtering. It shouldn't be too hard to make this work, and it could be fun! Small dishes on opposite ends of a continent could potentially generate some rather dramatic high-resolution images of traditional radioastronomy sources!
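The 400 kbit/second figure is consistent with simple arithmetic if one assumes real-valued sampling at the Nyquist rate and 4 bits per sample. The sample depth is an assumption made here, since the text does not state one. The same sketch also expresses a one-microsecond timing error in carrier cycles at 1.5 GHz, as a way of sizing the timestamp question; it does not settle whether GPS timing is good enough.

```python
def stream_bit_rate(bandwidth_hz, bits_per_sample):
    """Bits per second for Nyquist-rate real sampling: 2 * bandwidth * depth."""
    return 2.0 * bandwidth_hz * bits_per_sample

def carrier_cycles_in(time_s, carrier_hz):
    """Number of carrier cycles that elapse during `time_s`."""
    return time_s * carrier_hz

BANDWIDTH_HZ = 50e3        # from the text
BITS_PER_SAMPLE = 4        # assumed, not stated in the text
CARRIER_HZ = 1.5e9         # water window, from the text
TIMESTAMP_ERROR_S = 1e-6   # GPS-derived timestamp accuracy, from the text

print("bit rate (bit/s):", stream_bit_rate(BANDWIDTH_HZ, BITS_PER_SAMPLE))
print("carrier cycles per timestamp error:",
      carrier_cycles_in(TIMESTAMP_ERROR_S, CARRIER_HZ))
print("timestamp error as a fraction of 1/bandwidth:",
      TIMESTAMP_ERROR_S * BANDWIDTH_HZ)
```

A microsecond is a small fraction of the 20-microsecond inverse bandwidth, but it spans on the order of 1500 carrier cycles, so the two ways of measuring the error give very different impressions.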

One of the interesting aspects of such a stunt might be that most terrestrial interference sources are geographically localized, and thus will appear in only one or a few antennas. Thus, antennas scattered across a continent can provide a degree of interference immunity. Satellites, such as GPS, and esp. geostationary satellites, can still blanket a continent, unfortunately.

The biggest problem with arrays of small dishes is their poor performance in rejecting Johnson noise. For example, if 100 small dishes are aimed at the same signal, the total signal power will be 100 times larger than it is for one dish. Noise power from Gaussian white noise sums as the root-mean-square: the total noise power of an array of 100 dishes will be only 10 times that for a single dish. Thus, the S/N ratio for 100 dishes is 100/10 = 10 times better than for a single dish. By comparison, one can achieve the same gain in S/N by increasing the diameter of a dish by sqrt(10) = 3.2. That is, a single 6-meter dish has better S/N characteristics than 100 2-meter dishes. In terms of total cost (including operating cost), building & running a single 6-meter dish is cheaper than 100 2-meter dishes.

- Channel Capacity, a simple derivation (mirror).

Assume a pseudo-random binary bit sequence of length N
is given by b_{k} for integer k in [0,N-1].
Use this to phase-shift modulate a carrier of frequency W. The phase
shift is p, the length of each bit is T. Then

f(t) = Sum_{k=0}^{N-1} S_{k}(t) sin (Wt + S_{k}(t) p b_{k})

is the signal defined on the time interval [0,NT]. We used the notation
S_{k}(t) to be a step function that is one on the
interval t=[kT,(k+1)T] and is zero otherwise. To make f(t)
quasi-periodic with period NT, one chooses a frequency W such that
NTW = 2pi m for some integer m. The PRN (pseudo-random noise bit
sequence) then repeats with period NT. This signal, as described
above, does not carry any data; to modulate the signal with data,
one inverts all of the bits to denote a one, or leaves them to
denote a zero. That is, the maximum possible data rate is
1/NT, whereas the 'chipping' rate is 1/T.
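The signal definition above can be sketched in a few lines. This is a minimal sampled version, assuming T = 1 by default, a phase shift p of pi, and samples taken at the midpoints of equal sub-intervals within each chip; the function name and its parameters are invented for the illustration.

```python
import math

def prn_psk_signal(bits, m, samples_per_chip, p=math.pi, T=1.0, invert=False):
    """Sample f(t) = sin(W t + p b_k) on [0, N T], where k is the chip
    containing t and W = 2 pi m / (N T), so that N T W = 2 pi m.

    With invert=True every chip is flipped, which is how a data bit
    of 'one' is sent in the scheme described in the text."""
    n_chips = len(bits)
    w = 2.0 * math.pi * m / (n_chips * T)
    dt = T / samples_per_chip
    samples = []
    for k, b in enumerate(bits):
        chip = (1 - b) if invert else b
        for j in range(samples_per_chip):
            t = k * T + (j + 0.5) * dt   # midpoint of the j-th sub-interval
            samples.append(math.sin(w * t + p * chip))
    return samples

# A short 15-chip example sequence; m = 30 gives two carrier cycles per chip.
BITS = [1, 0, 0, 0, 1, 0, 0, 1, 1, 0, 1, 0, 1, 1, 1]
data_zero = prn_psk_signal(BITS, m=30, samples_per_chip=8)
data_one = prn_psk_signal(BITS, m=30, samples_per_chip=8, invert=True)
print(len(data_zero), data_zero[:4])
```

With p = pi, inverting every chip simply negates the waveform, so a receiver correlating against the known sequence sees a data 'one' as a correlation peak of the opposite sign.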

If the data consists of all zeros, then f(t) is exactly periodic,
and a Fourier series is appropriate to describe the
signal. The signal is given by

f(t) = Sum_{n=0}^{inf}
c_{n} cos (2 pi nt / NT)
+ s_{n} sin (2 pi nt / NT)

Solving for the Fourier coefficients c_{n} and s_{n}
one gets

c_{n} =
1/ (2 pi (m+n)) Sum_{k=0}^{N-1}
[ cos [2pi (m+n) k/N + p b_{k}]
- cos [2pi (m+n) (k+1)/N + p b_{k}] ]
+ 1/ (2 pi (m-n)) Sum_{k=0}^{N-1}
[ cos [2pi (m-n) k/N + p b_{k}]
- cos [2pi (m-n) (k+1)/N + p b_{k}] ]

and

s_{n} =
1/ (2 pi (m+n)) Sum_{k=0}^{N-1}
[ sin [2pi (m+n) k/N + p b_{k}]
- sin [2pi (m+n) (k+1)/N + p b_{k}] ]
- 1/ (2 pi (m-n)) Sum_{k=0}^{N-1}
[ sin [2pi (m-n) k/N + p b_{k}]
- sin [2pi (m-n) (k+1)/N + p b_{k}] ]

Note that if N is the product of small primes, then it is very
likely that adjacent terms will cancel each other out. If
N is a large prime, then adjacent terms will almost certainly
not cancel (unless N divides (m+n) or (m-n) which is unlikely).
Overall, there is more energy spread over a broader bandwidth
if N is prime, although it would be nice to back this
claim with a proof.

From this formula, we can also see that most of the energy will be carried by about 4N coefficients centered about m=n. Thus, the longer the PRN sequence, the more Fourier coefficients need to be considered. From this, one can see that narrow-band interference will wreck maybe a few of the coefficients without spoiling the signal as a whole. In other words, the larger the N, the better the interference suppression.
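The closed-form expression for c_{n} given earlier can be checked against direct numerical integration of f(t) cos(2 pi nt / NT). The sketch below assumes T = 1, p = pi, n >= 1, and the usual normalization c_{n} = (2/NT) times the integral over one period; the case n = m, where the closed form has a vanishing denominator, is handled by taking the limit. Function names and the 7-chip test sequence are chosen for illustration.

```python
import math

def c_n_closed_form(bits, m, n, p=math.pi):
    """c_n from the closed-form sums in the text (valid for n >= 1)."""
    N = len(bits)
    total = 0.0
    for q in (m + n, m - n):
        if q == 0:
            # Limit of the q term as q -> 0: the average of sin(p b_k).
            total += sum(math.sin(p * b) for b in bits) / N
            continue
        s = 0.0
        for k, b in enumerate(bits):
            s += (math.cos(2.0 * math.pi * q * k / N + p * b)
                  - math.cos(2.0 * math.pi * q * (k + 1) / N + p * b))
        total += s / (2.0 * math.pi * q)
    return total

def c_n_numeric(bits, m, n, p=math.pi, steps_per_chip=1000):
    """c_n by midpoint-rule integration, chip by chip, with T = 1."""
    N = len(bits)
    h = 1.0 / steps_per_chip
    acc = 0.0
    for k, b in enumerate(bits):
        for j in range(steps_per_chip):
            t = k + (j + 0.5) * h
            acc += (math.sin(2.0 * math.pi * m * t / N + p * b)
                    * math.cos(2.0 * math.pi * n * t / N))
    return 2.0 * acc * h / N

BITS = [1, 1, 1, 0, 1, 0, 0]   # short example sequence, N = 7
M = 5
for n in (1, 3, 5, 8):
    print(n, c_n_closed_form(BITS, M, n), c_n_numeric(BITS, M, n))
```

The two columns should agree to within the integration error. The same functions can be looped over a wider range of n to look at how the energy is distributed around n = m for a given sequence.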

Based on the above calculations, we can place some limits:

- The maximum reasonable chip rate f=1/T is about 100 MHz. We conclude this by noting that most of the power in the signal is in the band of about 4f, and the width of the 'water window' is about 400MHz.
- Scattering from turbulent galactic gases will broaden a line by 0.01Hz to 0.1Hz. This seems to provide a natural limit to the spacing of spectral lines. Thus, the smallest reasonable spacing is given by 1/NT = 0.01Hz. Lines spaced more closely than this would seem to bleed energy into each other and thus not provide any additional process gain.
- Combining the above, we conclude the largest practical value for N is about 1E10. This provides a maximum of about 100 dB of process gain. Expressed as a power of 2, it's about 2^33 bits. This number is in the ballpark for today's computers.
- Note that 1/NT is also the data rate. Thus, a natural data rate seems to be about 0.01 to 0.1 Hz.

- A 'perfect' (monochromatic) CW signal will be 'Doppler broadened' by scattering from turbulent interstellar gas. The smearing is approx 0.01Hz to 0.1Hz for the water window. (Need reference).
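The arithmetic behind the limits listed above is short enough to spell out. The sketch takes the 100 MHz chip rate and the 0.01 Hz line spacing from the text and treats the process gain as 10 log10(N); the function names are invented here.

```python
import math

def max_sequence_length(chip_rate_hz, line_spacing_hz):
    """N such that 1/(N T) equals the line spacing, with chip rate 1/T."""
    return chip_rate_hz / line_spacing_hz

def process_gain_db(n_chips):
    """Process gain of an N-chip sequence, expressed in dB."""
    return 10.0 * math.log10(n_chips)

CHIP_RATE_HZ = 100e6      # from the text
LINE_SPACING_HZ = 0.01    # from the text

n = max_sequence_length(CHIP_RATE_HZ, LINE_SPACING_HZ)
print("N:", n)
print("process gain (dB):", process_gain_db(n))
print("log2(N):", math.log2(n))
```

Using the 0.1 Hz end of the broadening range instead gives N of about 1E9 and roughly 90 dB.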

For a given length N, how many different PRN sequences
are there? The answer depends on how 'good' one wants
the PRN sequence properties to be. A 'good' sequence
will have the same geometric distribution of run lengths
(runs of identical bits) as white noise. A 'good' sequence
will spread power evenly over all of the Fourier coefficients
of the signal. In fact, most random sequences will be
'good' sequences, and there are approximately 2^{N}
of these. For N=1E10 this is insanely large; it is
impossible to check all of these. Thus, we need to
limit the number of sequences that might be searched
for. To do so, we need to guess the kinds of sequences
that ETI might choose to use.

An obvious set of candidates are those sequences that
have small polynomial generators: *i.e.* those
that can be generated by small shift registers whose
various taps are XOR'ed together and fed back. Sequences
of length N will typically have polynomial generators of
degree log_{2} N. Thus, for a bit sequence of
approximate length N = 2^{32} we can work with
a shift register of about 32 bits. There are 2^{32}
ways to XOR the 32 taps together; however, most of these
will fail to generate long sequences. I believe (this needs
support/derivation) that about sqrt(2^{32}) =
2^{16} polynomials are reasonable PRN generators.
If this is correct, then there are approx
1E10 x sqrt(1E10) = 1E15 different
PRN sequences that one might have to search for. This
is starting to get to be a dauntingly large number. Don't
forget that for any given sequence, one has to try each
of the N different bit alignments, which implies
1E10 x 1E15 = 1E25 possibilities to search. Each of these
in turn needs to be tried at many different frequencies and
chirp rates. One might limit the later searches to chirp
rates that assume that the transmitter has been corrected
to appear stationary in the galactic frame of reference.
However, one might also like to limit the number of generators
to attempt.
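To make the shift-register idea concrete, here is a minimal sketch of a linear feedback shift register, together with a count of the feedback polynomials that give maximal-length sequences. The count uses a standard result that is not derived in this essay: the number of primitive polynomials of degree n over GF(2) is phi(2^n - 1)/n, where phi is Euler's totient. The tap convention, function names and the degree-4 example (polynomial x^4 + x + 1) are choices made for the illustration.

```python
def lfsr_bits(taps, seed, length):
    """Bits of the recurrence s[t+n] = XOR of s[t+j] for j in taps,
    where n = len(seed). For x^4 + x + 1 use taps (0, 1) and a 4-bit seed."""
    window = list(seed)
    out = []
    for _ in range(length):
        out.append(window[0])
        new_bit = 0
        for j in taps:
            new_bit ^= window[j]
        window = window[1:] + [new_bit]
    return out

def euler_phi(n):
    """Euler's totient by trial division (fine for numbers near 2^32)."""
    result = n
    d = 2
    while d * d <= n:
        if n % d == 0:
            while n % d == 0:
                n //= d
            result -= result // d
        d += 1
    if n > 1:
        result -= result // n
    return result

def count_primitive_polynomials(degree):
    """Number of degree-n primitive polynomials over GF(2): phi(2^n - 1) / n."""
    return euler_phi(2 ** degree - 1) // degree

sequence = lfsr_bits(taps=(0, 1), seed=(1, 0, 0, 0), length=30)
print(sequence[:15])
print(count_primitive_polynomials(4), count_primitive_polynomials(32))
```

The degree-4 register repeats with period 2^4 - 1 = 15, the longest possible for four bits. For degree 32 the count comes to 2^{26}, which suggests that the 2^{16} guess above may be on the low side; each such generator still has to be tried at every alignment.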

To this end, one might attempt to look for notable primes
and generators. Are there notable, 'famous' primes in the
vicinity that are 'preferable' in some way?
e.g. 2^{32}+-1?
Are there notable generators? e.g. generators that not only
have long sequences and good PRN properties, but are also
notable by their relationship to other famous problems,
e.g. finite/sporadic groups, Fermat's Last Theorem, etc.?

- Counting Primes provides some estimates for the number of prime numbers.
- Chirp due to earth spin is 0.16 Hz/sec
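The chirp figure can be reproduced from the centripetal acceleration of a point on the rotating Earth. The sketch below assumes an observer on the equator with the source direction lying along that acceleration (the worst case), and an observing frequency near 1.42 GHz; the frequency is an assumption, since the note above does not state one. Constant and function names are chosen for illustration.

```python
import math

SIDEREAL_DAY_S = 86164.1               # Earth's rotation period
EARTH_EQUATORIAL_RADIUS_M = 6.378137e6
SPEED_OF_LIGHT_M_S = 299792458.0

def max_spin_drift_hz_per_s(freq_hz, latitude_deg=0.0):
    """Largest Doppler drift rate caused by Earth's spin for an observer
    at the given latitude: f * omega^2 * R * cos(latitude) / c."""
    omega = 2.0 * math.pi / SIDEREAL_DAY_S
    accel = omega ** 2 * EARTH_EQUATORIAL_RADIUS_M * math.cos(math.radians(latitude_deg))
    return freq_hz * accel / SPEED_OF_LIGHT_M_S

print("drift at 1.42 GHz (Hz/s):", max_spin_drift_hz_per_s(1.42e9))
print("drift at 1.5 GHz (Hz/s):", max_spin_drift_hz_per_s(1.5e9))
```

The drift scales linearly with observing frequency and shrinks with the cosine of the observer's latitude. The Earth's orbital motion contributes a further, smaller drift that is not included here.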

- Since spread-spectrum is an inherently broadband phenomenon, most astronomy receivers are useless, as they are designed to be narrow-band.
- LNBF -- Low Noise pre-amplifier, Block Frequency Down-converter is ideal. The point being that PSK signals survive down-conversion beautifully.
- Noise figure of 0.4 dB in low-noise preamp Model 1691ULNA from Downeast Microwave, Inc.
- Haystack SRT Small Radio Telescope Receiver Details
- Antennas, a Russian academic journal.
- Synthetic Aperture Theory
- Synthetic Aperture Imaging with Arrays of Arbitrary Shape (abstract only)
- The Society of Amateur Radio Astronomers (SARA)

Draft of April 2003, Linas Vepstas

Spell checking and minor additional URL links, August 2006.

Copyright (c) 2003 Linas Vepstas linas@linas.org

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.1; with no Invariant Sections, with no Front-Cover Texts, and with no Back-Cover Texts. A copy of the license is included at the URL http://www.linas.org/fdl.html, the web page titled "GNU Free Documentation License".