A Machine Learning Technique to Identify Transit Shaped Signals

Daniel Wysocki • Sep 2, 2015 • journal_club

Paper

“A Machine Learning Technique to Identify Transit Shaped Signals” by Thompson et al.

Supervised machine learning is used to automatically identify exoplanets from Kepler light curves.

Figure 1 shows some transit-like TCEs, folded on the left, and binned on the right. Figure 2 shows the same for non-transit-like TCEs.

Figure 4 shows a histogram of log(T_LPP) for (non-)transit-like TCEs.

Figure 5 shows a histogram of log(T_LPP) for injected transits, divided into pass/fail.

(Summarized in Section 3.2, page 3, 3rd paragraph in the right column)

Each star’s light curve is folded on the Threshold Crossing Event (TCE) periods provided by the NExScI archive
An equal number of bins near the TCE are chosen for each light curve
- Bins are selected such that 51 lie within, and 90 outside the transit
  - (See Section 3.5 on page 5 for details)
The mean of each bin is used as the magnitude
Points are sorted by phase
Points are normalized such that the minimum value is -1
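
The steps above can be sketched roughly as follows. This is a minimal illustration, not the authors' pipeline: the function names `bin_edges` and `fold_bin_normalize`, the `window` parameter (the width, in phase, of the region treated as "within the transit"), and the use of the median as the out-of-transit baseline are all assumptions made for the example. The actual bin placement is described in Section 3.5 of the paper and may differ.

```python
from bisect import bisect_right
from statistics import mean, median


def bin_edges(window, n_in=51, n_out=90):
    """Phase-bin edges: n_in bins across the transit window, n_out outside it.

    `window` is a hypothetical parameter: the phase width (in cycles)
    centered on the transit that receives the finer binning.
    """
    half = window / 2.0
    side = n_out // 2  # out-of-transit bins on each side of the window
    left = [-0.5 + i * (0.5 - half) / side for i in range(side)]
    middle = [-half + i * window / n_in for i in range(n_in)]
    right = [half + i * (0.5 - half) / side for i in range(side + 1)]
    return left + middle + right


def fold_bin_normalize(times, fluxes, period, epoch, window):
    """Fold on `period`, bin by phase, and scale so the minimum is -1."""
    edges = bin_edges(window)
    n_bins = len(edges) - 1
    members = [[] for _ in range(n_bins)]

    for t, flux in zip(times, fluxes):
        # Fold: phase in cycles, with the transit (at `epoch`) at phase 0.
        phase = ((t - epoch) / period + 0.5) % 1.0 - 0.5
        index = min(bisect_right(edges, phase) - 1, n_bins - 1)
        members[index].append(flux)

    # Bins are already in phase order; empty bins are dropped here,
    # so a sparsely sampled curve yields fewer than n_bins points.
    binned = [mean(values) for values in members if values]

    # Assumption: the median bin approximates the out-of-transit level.
    baseline = median(binned)
    depth = baseline - min(binned)
    if depth == 0:
        raise ValueError("flat light curve: nothing to normalize")
    return [(value - baseline) / depth for value in binned]
```

With this scaling the out-of-transit level sits near 0 and the deepest bin is exactly -1, so curves of different depths become directly comparable before dimensionality reduction.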

(Summarized in Section 3.3, beginning of page 5)

Locality preserving projections (LPP) project the binned light curves into a lower-dimensional space
- Algorithm is similar to PCA
  - Less sensitive to outliers
  - Better at preserving locality (hence the name) for methods like k-NN
- Produces a 20-dimensional feature vector for each event
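
To make the idea concrete, here is a minimal NumPy sketch of the standard LPP formulation: build a nearest-neighbour graph over the samples, then solve the generalized eigenproblem X^T L X a = λ X^T D X a and keep the eigenvectors with the smallest eigenvalues. It is not the paper's implementation. The function name `lpp`, the parameters `n_neighbors` and `heat`, their default values, and the small ridge term are assumptions chosen for illustration.

```python
import numpy as np


def lpp(X, n_components, n_neighbors=5, heat=1.0):
    """Return a (features x n_components) LPP projection matrix.

    X holds one sample per row. n_neighbors must be smaller than the
    number of samples. Parameter names and defaults are illustrative.
    """
    X = np.asarray(X, dtype=float)
    n_samples, n_features = X.shape

    # Pairwise squared Euclidean distances between samples.
    sq = np.sum(X**2, axis=1)
    dist2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * X @ X.T, 0.0)

    # Symmetric k-nearest-neighbour graph with heat-kernel weights.
    # Column 0 of the argsort is normally the sample itself, so skip it.
    neighbors = np.argsort(dist2, axis=1)[:, 1:n_neighbors + 1]
    rows = np.repeat(np.arange(n_samples), n_neighbors)
    cols = neighbors.ravel()
    W = np.zeros((n_samples, n_samples))
    W[rows, cols] = np.exp(-dist2[rows, cols] / heat)
    W = np.maximum(W, W.T)

    D = np.diag(W.sum(axis=1))
    L = D - W  # graph Laplacian

    A = X.T @ L @ X
    B = X.T @ D @ X
    # Assumption: a small ridge keeps B invertible for degenerate data.
    B += 1e-9 * np.trace(B) / n_features * np.eye(n_features)

    # Whiten with B^(-1/2) so a standard symmetric eigensolver applies.
    evals_B, evecs_B = np.linalg.eigh(B)
    B_inv_sqrt = evecs_B @ np.diag(evals_B ** -0.5) @ evecs_B.T
    M = B_inv_sqrt @ A @ B_inv_sqrt
    evals, evecs = np.linalg.eigh((M + M.T) / 2.0)

    # eigh sorts eigenvalues in ascending order; the smallest ones
    # correspond to directions that keep neighbouring samples close.
    return B_inv_sqrt @ evecs[:, :n_components]
```

Given an array `binned_curves` with one binned light curve per row (a hypothetical name), `binned_curves @ lpp(binned_curves, n_components=20)` would yield a 20-dimensional feature vector per event, matching the dimensionality quoted above.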

(Summarized in Section 3.5, pages 5 and 7)

Removes over 90% of non-transiting candidates from Kepler data and retains over 99% of known transits
Loses 1% of injected transits
- (Injection described in Section 3.6, page 7)