First: don't say FFT when you actually mean DFT (Discrete Fourier Transform); the FFT is just an algorithm that computes the DFT efficiently.
Second: the Fourier transform of random data (a stochastic process) is rather tricky to work with and interpret. You should first try to understand the DFT for deterministic data.
Third: in most typical applications, you don't take the Fourier transform of the "full signal" (12489 samples = 31 seconds), but rather segment it into short "frames" and take the DFT of each frame.
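A minimal NumPy sketch of that framing step (the frame length of 256 and the random stand-in signal are arbitrary assumptions, just for illustration):

```python
import numpy as np

x = np.random.randn(12489)    # stand-in for the 12489-sample signal in the question

frame_len = 256               # hypothetical frame length; typical values are a few hundred samples
n_frames = len(x) // frame_len
frames = x[:n_frames * frame_len].reshape(n_frames, frame_len)

# DFT of each frame: one row per frame, one column per frequency bin
X = np.fft.fft(frames, axis=1)
print(X.shape)                # (48, 256)
```

(Real applications usually also apply a window and overlap the frames, but the idea is the same.)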
What should be the average amplitude of the data, post-FFT?
You must remember that the DFT is not real-valued but complex. If you are interested only in magnitudes, you can of course take the (squared) absolute value. Now, if the signal is random, this is equivalent to computing a periodogram, which is an estimate of the spectral density of the signal. The "spectrum" (which is not random) of a random signal is the Fourier transform, not of the signal itself, but of its autocorrelation function. Informally, it measures how much "energy" the signal has in each frequency band.
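As a sketch (using white noise as a stand-in for "random data"): the periodogram is just the squared DFT magnitude with a 1/N normalization, and its mean equals the mean squared value of the signal:

```python
import numpy as np

rng = np.random.default_rng(0)
N = 1024
x = rng.standard_normal(N)       # white-noise stand-in for a random signal

# Periodogram: squared magnitude of the DFT, normalized by N
X = np.fft.fft(x)
periodogram = np.abs(X) ** 2 / N

# Its mean equals the mean squared value of the signal (here close to 1,
# the variance of the noise)
print(periodogram.mean())
```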
So, the answer to your question is not simple. The only simple property that could help is Parseval's theorem: it says that the mean squared value of the spectrum equals the mean squared value of the signal (the "total energy"), properly normalized.
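A quick numerical check of Parseval's theorem, using NumPy's convention where the 1/N normalization sits on the DFT side:

```python
import numpy as np

x = np.array([1.0, 2.0, -1.0, 0.5])
X = np.fft.fft(x)

# Parseval: sum |x[n]|^2 == (1/N) * sum |X[k]|^2
lhs = np.sum(np.abs(x) ** 2)
rhs = np.sum(np.abs(X) ** 2) / len(x)
print(np.isclose(lhs, rhs))  # True
```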
Another property (for deterministic signals) is that the zero-frequency value of the DFT is the mean value of the signal, properly normalized.
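For example, with NumPy's unnormalized forward DFT, X[0] is the sum of the samples, so dividing by N recovers the mean:

```python
import numpy as np

x = np.array([3.0, 1.0, 4.0, 1.0, 5.0])
X = np.fft.fft(x)

# X[0] = sum(x), so X[0] / N is the mean of the signal
print(X[0].real / len(x))    # 2.8
print(x.mean())              # 2.8
```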
what is the significance of the maximum amplitude of binary data that is not random, but consists of 1,0,1,0,1,0 data (12489 points).
Such a signal has almost all its energy at the highest frequency (plus a zero-frequency component given by its mean value, 1/2). Hence, its DFT will be practically zero everywhere except at frequency zero and at k = N/2 (which corresponds to the maximum frequency).
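A small NumPy check (using an even length N = 16 for a clean example; with the question's odd length 12489 the peak lands near, rather than exactly at, N/2):

```python
import numpy as np

N = 16
x = np.tile([1.0, 0.0], N // 2)   # the 1,0,1,0,... pattern

X = np.fft.fft(x)
mags = np.abs(X)

# Energy only at k=0 (the mean) and k=N/2 (the highest frequency);
# both bins equal N/2 = 8 here, and every other bin is exactly zero
print(mags[0], mags[N // 2])
print(np.allclose(np.delete(mags, [0, N // 2]), 0.0))
```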