Machine Learning in the atmosphere
Fourier analysis¶
So, the story so far:
Looking to see if there is a relationship between the tidal pattern in pressure data and the moon.
The pressure data gives a waveform. The Fourier transform of this data we can converts the data from a time series to give us the underlying frequencies driving that data.
In [111]:
# Tell matplotlib to plot in line
%matplotlib inline
import pandas as pd
# seaborn magically adds a layer of goodness on top of Matplotlib
# mostly this is just changing matplotlib defaults, but it does also
# provide some higher level plotting methods.
import seaborn
# Tell seaborn to set things up
seaborn.set()
from matplotlib import pyplot
import mpld3
In [112]:
# interactive plots -- except in ein
#mpld3.enable_notebook()
In [113]:
infile="../files/moon_weather.csv"
In [114]:
import numpy
from numpy import fft
In [115]:
data = pd.read_csv(infile, index_col='date', parse_dates=['date'])
In [116]:
data.altitude[:24480].plot()
Out[116]:
In [117]:
data.describe()
Out[117]:
In [118]:
24480/720.
#(60. * 12)
Out[118]:
In [119]:
N = 24480
fft_altitude = fft.fft(data[:N].altitude)
In [120]:
# Plot the transform. Most of the energy in the signal is at low frequencies
pd.Series(x.real for x in fft_altitude).clip(-1000, 1000)[:].plot()
Out[120]:
In [121]:
# spike of energy at location 34
pd.Series(x.real for x in fft_altitude)[20:40].plot()
Out[121]:
In [150]:
# Convert location into period
point = 34
for p in (point-1, point, point+1):
print(p)
print("%8.3f %s" % (N/p, "minutes"))
print("%8.3f %s" % (N/(p * 60), "hours"))
print("%8.3f %s" % (N/(p * 60 * 24), "days"))
The spike of energy at location 34 corresponds to a signal with a period of 12 hours.
Note that the amount of data available limits the resolution of transform.
A peak at position 33 would give a period roughly 22 minutes longer, at 35 it would be 21 minutes shorter.
It looks like the tidal pattern in the pressure data has a period of 24 hours.
Hence, it looks like it is driven by the sun, not the moon.
In [123]:
# transform is complex, look at the imaginary component
pd.Series(x.imag for x in fft_altitude)[0:60].plot()
Out[123]:
In [124]:
# remove 12 hour and higher freq components:
xx = fft_altitude.copy()
xx[point:-point] = 0.0
xx[int(point/2)] = 0.0
#xx[point+1:] = 0.0
In [125]:
smooth_altitude = fft.ifft(xx)
In [126]:
pd.Series(y.real for y in smooth_altitude).plot()
Out[126]:
In [133]:
df = pd.DataFrame()
df['altitude'] = data.altitude[:N]
df['smooth_altitude'] = [y.real for y in smooth_altitude]
df.plot(subplots=False)
Out[133]:
In [147]:
smdf = pd.DataFrame(dict(smooth_altitude=smooth_altitude, fft_alt=fft.fft(smooth_altitude)))
smdf.fft_alt[6:50].plot()
Out[147]:
In [157]:
# moon takes 29.53 days to be in same position relative to earth/sun
N/(29.53 * 24 * 60)
Out[157]:
In [135]:
pd.Series(fft.fft(df.smooth_altitude)
Out[135]:
In [129]:
import math
sine = [numpy.sin(x/math.pi) for x in range(1000)]
In [130]:
xfft = fft.fft(sine)
df = pd.DataFrame()
df['sine'] =sine
df['ffti'] = [x.imag for x in xfft]
df['fftr'] = [x.real for x in xfft]
df[:].plot(subplots=True)
Out[130]:
In [131]:
df.describe()
Out[131]:
In [132]:
df.head(20)
Out[132]:
$$A_k = \sum_{m=0}^{n-1} a_m \exp\left\{-2\pi i{mk \over n}\right\}
\qquad k = 0,\ldots,n-1.$$
$$a_m = \exp\{2\pi i\,f m\Delta t\}$$
where $\Delta t$ is the sampling interval
Comments
Comments powered by Disqus