The man who found a world: detecting an exoplanet

Skepticism over using artificial intelligence for scientific discoveries is best beaten with hard evidence, says a researcher who found an undiscovered world with a little help from machine learning.

“Pretty exciting. It’s not something we expected,” says University of Georgia doctoral student Jason P. Terry when describing to Cybernews how it feels to discover a whole new world orbiting a distant star.

Until recently, humans had only ‘discovered’ two new planets: Uranus and Neptune. These ice giants, at the edge of our Solar System, are so far away that both – unlike the other planets – are practically invisible to the naked eye.

While there are only eight planets in our Solar System, thousands lie beyond our Sun’s grip. Extrasolar planets, or exoplanets for short, were first proven to exist in 1992 after the discovery of Poltergeist and Phobetor circling an exotic pulsating star 2,300 light years away from Earth.

Since then, over 5,500 exoplanets – completely alien worlds – have been confirmed, pushing scientists to accept that almost every single star in the night sky likely harbors at least one planet in its orbit. Machine learning may hugely accelerate the pace of discovery.

How are exoplanets discovered?

Currently, two methods of exoplanetary detection have been most fruitful: the transit method (4,125 planets discovered) and the radial velocity method (1,066 planets).

The first method focuses on observing the slight dimming of the star. When an object passes directly between its star and an observer, it dims the star’s light by a measurable amount, indicating the presence of a planet.
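How slight is that dimming? A minimal sketch, assuming a dark planet crossing the centre of a uniformly bright star, so that the blocked fraction of light is simply the ratio of the two discs' areas. The radii are standard approximate values and the function name is only illustrative.

```python
# Approximate radii in kilometres (standard reference values).
R_SUN_KM = 695_700.0
R_JUPITER_KM = 71_492.0
R_EARTH_KM = 6_371.0


def transit_depth(planet_radius_km: float, star_radius_km: float) -> float:
    """Fraction of starlight blocked while the planet crosses the stellar disc.

    Assumes a dark planet, a uniformly bright star, and a central transit.
    """
    return (planet_radius_km / star_radius_km) ** 2


jupiter_depth = transit_depth(R_JUPITER_KM, R_SUN_KM)
earth_depth = transit_depth(R_EARTH_KM, R_SUN_KM)

print(f"Jupiter-size planet: {jupiter_depth:.2%} dimming")
print(f"Earth-size planet:   {earth_depth:.4%} dimming")
```

Under these assumptions, a Jupiter-size planet dims a Sun-like star by about one percent, while an Earth-size planet blocks less than a hundredth of a percent, which is why small worlds are so much harder to catch.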

Jason P. Terry. Image by Cybernews / Terry.

The second method, also known as “watching for wobble,” detects if a planet impacts its parent star by causing it to wobble. The technique relies on the fact that a star does not remain stationary when a planet orbits it.
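The size of the wobble can be estimated with the textbook formula for a circular orbit. The sketch below is an illustration under that assumption, using approximate constants and a hypothetical helper name; it computes how fast Jupiter makes the Sun move back and forth.

```python
import math

G = 6.674e-11                 # gravitational constant, m^3 kg^-1 s^-2
M_SUN_KG = 1.989e30           # approximate solar mass
M_JUPITER_KG = 1.898e27       # approximate mass of Jupiter
SECONDS_PER_YEAR = 365.25 * 86_400


def rv_semi_amplitude(period_s, star_mass_kg, planet_mass_kg, inclination_deg=90.0):
    """Peak line-of-sight speed of the star in m/s, assuming a circular orbit.

    An inclination of 90 degrees means the orbit is seen edge-on.
    """
    sin_i = math.sin(math.radians(inclination_deg))
    total_mass = star_mass_kg + planet_mass_kg
    return ((2 * math.pi * G / period_s) ** (1 / 3)
            * planet_mass_kg * sin_i / total_mass ** (2 / 3))


# Jupiter takes roughly 11.86 years to circle the Sun.
jupiter_k = rv_semi_amplitude(11.86 * SECONDS_PER_YEAR, M_SUN_KG, M_JUPITER_KG)
print(f"The Sun's wobble due to Jupiter: {jupiter_k:.1f} m/s")
```

The result is on the order of ten metres per second, about the pace of a sprinter, measured on a star many light years away.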

Meanwhile, Terry and his team chose a different path: AI. To make things interesting, his team focused on a star with a protoplanetary disc, a planet-forming accumulation of gas and dust around a young star.

“The AI was designed to determine whether or not a given observation of a protoplanetary disk contains a planet. However, there are not enough observations to train effective models at this point,” Terry explained.

How do you train your planet-hunting AI?

To generate the dataset that Terry’s AI uses to hunt for planets, researchers ran a large number of computer simulations.

In a simulated star system, scientists know exactly how a protoplanetary disc containing planets looks and behaves. To verify the accuracy of the simulations, researchers applied them to all protoplanetary discs that had previously been found to have planets.

“That allowed us to create machine learning models that could then be applied to observations. So, using these methods, we would pass the observations through the machine learning models. The models would give us an estimate of the confidence that it thinks there’s a planet,” Terry said.

Once the AI is trained, it’s given actual observational data. One of the stars with protoplanetary discs that Terry’s team was interested in was HD 142666, a yellow-white star 477 light years away from the Sun. A mere stone’s throw away in cosmic terms.

“After we ran HD 142666 through the models, the majority of them came back with a 95% confidence level. Which means the model was very confident that there was a planet,” Terry remembered.
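The article does not describe how the team combined its models, so the following is only a sketch of one simple way to read "the majority of them": count how many models report a confidence at or above a threshold. The scores are made-up placeholders, not the team's results, and the 0.95 cut-off and function name are assumptions.

```python
def majority_detects(confidences, threshold=0.95):
    """Return True if more than half of the models meet the confidence threshold."""
    votes = sum(1 for c in confidences if c >= threshold)
    return votes > len(confidences) / 2


# Placeholder scores for five hypothetical models (NOT real data).
example_scores = [0.97, 0.96, 0.95, 0.62, 0.99]

print(majority_detects(example_scores))
```

A vote like this only flags a candidate. As the next section explains, the detection still has to be checked against the observations themselves.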

There it was, HD 142666 b, a previously undiscovered world. Five times more massive than Jupiter, the largest planet in our Solar System, Terry’s planet is 75 astronomical units (AUs) away from its parent star. For comparison, one AU is equal to the distance between the Earth and the Sun.
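To put 75 AU in perspective, the short calculation below converts it to kilometres and light travel time, and compares it with Neptune's orbit. The AU and the speed of light are defined constants; Neptune's distance of about 30 AU is an approximate value added here for comparison.

```python
AU_KM = 149_597_870.7             # one astronomical unit in km (IAU definition)
SPEED_OF_LIGHT_KM_S = 299_792.458
NEPTUNE_AU = 30.07                # approximate average distance of Neptune from the Sun

orbit_au = 75.0                   # distance of HD 142666 b from its star, per the article
orbit_km = orbit_au * AU_KM
light_hours = orbit_km / SPEED_OF_LIGHT_KM_S / 3600
neptune_ratio = orbit_au / NEPTUNE_AU

print(f"{orbit_km:.3e} km from the star")
print(f"{light_hours:.1f} hours for starlight to arrive")
print(f"{neptune_ratio:.1f} times Neptune's distance from the Sun")
```

That is more than 11 billion kilometres, roughly two and a half times farther out than Neptune, and the star's light takes over ten hours to get there.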

Locating a planet in a protoplanetary disc. Image by Terry.

Locating the invisible

The neat part of applying machine learning models to planetary discovery is that researchers can look inside the model to see which parts of the image it treats as critical when it makes its decision.

Even if the model alone cannot determine whether a planet is hiding in the data, researchers can focus on specific data points that the model deems critical. That’s precisely what happened with HD 142666 observations.

Once the AI pointed out that the star system might harbor a planet, Terry’s team focused on the exact data point that the model said mattered the most. After analyzing observational data, they could conclude there’s a disturbance that looks exactly like one a planet would make.
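The article does not say which technique the team used to see what their model was focusing on. One generic approach is occlusion sensitivity: hide one part of the image at a time and watch how much the model's score drops. The sketch below applies it to a tiny made-up image with a toy scoring function standing in for a trained model; every name and value in it is hypothetical.

```python
def toy_score(image):
    """Hypothetical stand-in for a trained model: brightest pixel minus the mean."""
    pixels = [value for row in image for value in row]
    return max(pixels) - sum(pixels) / len(pixels)


def occlusion_map(image, score_fn):
    """Score drop caused by replacing each pixel, in turn, with the image mean."""
    pixels = [value for row in image for value in row]
    fill = sum(pixels) / len(pixels)
    baseline = score_fn(image)
    drops = []
    for r, row in enumerate(image):
        drop_row = []
        for c in range(len(row)):
            occluded = [list(original_row) for original_row in image]
            occluded[r][c] = fill          # hide a single pixel
            drop_row.append(baseline - score_fn(occluded))
        drops.append(drop_row)
    return drops


# A flat 5x5 "observation" with one disturbed pixel at row 3, column 1.
image = [[1.0] * 5 for _ in range(5)]
image[3][1] = 4.0

drops = occlusion_map(image, toy_score)
most_important = max(
    ((r, c) for r in range(5) for c in range(5)),
    key=lambda rc: drops[rc[0]][rc[1]],
)
print(most_important)
```

Hiding the disturbed pixel causes by far the largest drop in score, so the map points straight at it. On real data, a region flagged this way is where researchers would look for the kind of disturbance a planet makes.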

“The purpose of the AI model is to direct discovery and analysis. We looked at that region because the model said that it was important, allowing us to hone in on that specific region, to uncover this information that had been obscured for so long,” Terry said.

When can I find a planet?

The implications of applying machine learning to planetary detection are far-reaching. For one, while hardly perfect, AI is very good at crunching large datasets. Armed with planet-detecting models, scientists could sift through vast amounts of already existing stellar data, looking for evidence of planets.

Terry believes that the model he and his team developed could be perfected to look for planets around fully formed stars. Machine learning might become so sensitive that it could spot smaller planets, much like the ones in our solar neighborhood. Worlds that could have rocky surfaces, warmth, liquid water, and maybe even life.

“The nice thing about these methods is that they’re pretty general. We applied them to protoplanetary disk observations, but what we’re developing is completely general towards any observation that produces this data structure. We can apply these methods towards observations that are more likely to include signatures of rocky exoplanets, the exciting ones,” Terry said.

While it will require additional time and resources, the model used to discover HD 142666 b could be repurposed for use by the general public. This means that citizen astronomers could soon join the planetary hunt, increasing the number of known exoplanets severalfold.

“It’s not deployed yet, but that is very much in our future. So that anyone could use it, anyone could then discover an exoplanet. Of course, experts would have to investigate if an enthusiast said there’s an exoplanet in the data. But citizen science contributing towards those avenues of investigation is something that can be achieved,” Terry said.

Evidence will expel skepticism

One of the most promising avenues for the scientific application of machine learning in astronomy is observational analysis. For example, powerful tools like the James Webb Space Telescope (JWST) are capable of integral field spectroscopy (IFS), which records a spectrum for every point in an image.

“There’s no reason why our methods can’t apply to this data in general. We could be looking at supernova remnants, a whole host of really important, interesting things. For context, supernova remnant images in this same type of data have been used to find neutron stars,” Terry said.

For example, the first exoplanets were discovered after revisiting old data. While it was humans, not AI, who made the discovery, it serves as an example of the treasures hidden in the vast volumes of data scientists may deem already processed.

However, before astronomers and astrophysicists add AI to their toolbox, machine-learning skeptics will have to be proven wrong. As anyone who has ever used generative AI like ChatGPT or Google Bard can tell, contemporary tools are far from perfect and are even prone to hallucinations.

Scientists mapping the skies may be skeptical about the black box problem, where there’s no clear answer as to why an AI model provides the results that it does. For example, a machine learning model could tell scientists the data indicates the presence of a galaxy but could not tell why it came to such a conclusion.

However, Terry is optimistic. He says that over the past several years, researchers have started to get acquainted with using AI for research. And those who haven’t eventually will.

“The field’s coming around, but I do still think that not everyone is fully on board, and there are legitimate reasons why that happens. But at the end of the day, you can’t argue with the results,” Terry concluded.