By Constantino Panagopulos
A new certificate program at The University of Texas at Austin’s Jackson School of Geosciences will build critical skills for the geosciences workforce by training students and professionals to use modern data analytics tools and machine learning.
Machine learning – a kind of artificial intelligence – helps geoscientists make better use of modern Earth monitoring methods, such as satellites, weather stations, and seismic sensors, which generate massive volumes of data. By using computers to turn the data into useful information, scientists can solve geoscience problems much more quickly and efficiently than with traditional methods.
Program Leader Mrinal Sen, a professor at the Jackson School and the Jackson Chair in Applied Seismology, said the certificate program will better prepare students for the real-world challenges of a career in geoscience.
“Employers are realizing that geoscientists need to be able to make use of machine learning tools,” Sen said. “That’s why we need to train our people to use these tools and show them how they work.”
The certificate program teaches students the fundamental concepts of machine learning, including data analytics, geostatistical and numerical modeling, and artificial intelligence. Students also learn how to use the programming language Python, a tool that has been widely embraced by the geoscience community for its wealth of open-source packages aimed at efficiently solving real-world data problems.
Alongside core theoretical components, students will have the chance to apply what they learn to a range of environments, from Martian weather to the Earth’s interior. Program instructor Michael Pyrcz, an assistant professor at the Jackson School and the Cockrell School of Engineering, said that while the program teaches students how to apply machine learning to geoscience problems, the skills they learn could be applied to any data-driven field.
“Data-driven opportunities are everywhere,” said Pyrcz, who runs a popular series of online lectures on YouTube as the GeostatsGuy. “Geoscience students are unique: they have excellent skills in spatial data, information integration, and scientific problem solving, they know big data, they learn fast and enhance their impact with [artificial intelligence].”
Grappling With Data
Pyrcz is a recognized expert in his field—he literally wrote the book on spatial data analytics and geostatistics—and is joined on the program by an emerging generation of “data-fluent” geoscientists, such as Jackson School Assistant Professor Daniel Trugman.
“Our field has always been a data-heavy science,” said Trugman, “but I think in the last 10 years or so, the volume of data has really exploded across the world.”
Trugman gives the example of California’s seismic network, which includes 500 seismic stations in Southern California alone, each recording over 100 measurements per second, and a similar network, TexNet, which came online in Texas in 2017 and is operated by the Jackson School’s Bureau of Economic Geology. These networks can pinpoint the strength and location of tremors occurring across the world, but their sensitivity comes at a cost.
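To get a feel for the data volume described above, the figures quoted for Southern California can be turned into a rough lower bound. This is only a back-of-the-envelope sketch: it assumes exactly 100 measurements per second (the article says "over 100") and a single recording channel per station, which is an assumption made here for illustration.

```python
# Figures taken from the article; the single-channel assumption is ours.
STATIONS = 500                 # seismic stations in Southern California
SAMPLES_PER_SECOND = 100       # lower bound: "over 100 measurements per second"
SECONDS_PER_DAY = 24 * 60 * 60

# Measurements arriving from the whole network every second.
network_samples_per_second = STATIONS * SAMPLES_PER_SECOND

# Measurements accumulated by the whole network in one day.
samples_per_day = network_samples_per_second * SECONDS_PER_DAY

print(f"{network_samples_per_second:,} measurements per second")  # 50,000
print(f"{samples_per_day:,} measurements per day")                # 4,320,000,000
```

Even under these conservative assumptions, the network produces billions of measurements a day.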
“In Texas, in particular, there’s a lot of industry and traffic that shows up as noise in the data,” said Trugman. “We can use machine learning algorithms to basically de-noise seismic waveforms and leave you with just a signal.”
De-noising data means teaching a computer to recognize what’s useful – in this case a seismic wave – and throw out junk data such as traffic, industrial noise, crashing waves, and even the wind. As it churns through the data, the computer gets better and faster. A human could do the same work, but it would take years to get through what a computer could do in a few minutes.
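The idea of a computer learning to separate signal from noise can be sketched in a few lines of Python. The example below is not the method Trugman's group uses, and real seismic de-noisers are typically far more sophisticated. It is a minimal illustration on synthetic data: a linear filter is fitted by least squares to pairs of noisy and clean traces, then applied to a trace it has not seen. The functions `make_trace` and `windows`, the wavelet shape, and the noise level are all assumptions made for this illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_trace(n=2000, rng=rng):
    """Synthetic trace: a decaying oscillation (the 'signal') plus white noise."""
    t = np.arange(n)
    onset = rng.integers(300, 1500)
    clean = np.where(
        t >= onset,
        np.exp(-(t - onset) / 150.0) * np.sin(2 * np.pi * (t - onset) / 40.0),
        0.0,
    )
    noisy = clean + 0.5 * rng.standard_normal(n)
    return noisy, clean

def windows(x, half):
    """One window of length 2*half+1 centred on every sample (zero-padded edges)."""
    padded = np.pad(x, half)
    return np.lib.stride_tricks.sliding_window_view(padded, 2 * half + 1)

HALF = 20

# "Training": learn filter weights that map noisy windows to the clean sample.
train = [make_trace() for _ in range(20)]
X = np.vstack([windows(noisy, HALF) for noisy, _ in train])
y = np.concatenate([clean for _, clean in train])
weights, *_ = np.linalg.lstsq(X, y, rcond=None)

# Apply the learned filter to a new, unseen trace.
test_noisy, test_clean = make_trace()
denoised = windows(test_noisy, HALF) @ weights

mse_before = np.mean((test_noisy - test_clean) ** 2)
mse_after = np.mean((denoised - test_clean) ** 2)
print(f"mean squared error before: {mse_before:.4f}, after: {mse_after:.4f}")
```

The filter is never told what traffic or wind looks like. It only sees examples of noisy input and the desired output, which is the same principle, in miniature, as the approach described above.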
Trugman uses the de-noised data to detect tiny tremors around a fault in the moments before an earthquake is triggered. This method is a huge step in earthquake science and could be used to improve the accuracy of future early warning systems.
Shaping Data into Insight
Cleaning up data is an important but relatively simple application of machine learning. A similar technique helped Jackson School graduate student Ben Rendall build a geologic map of the Bahamas. It was a feat that would have taken years to complete using traditional methods but took only a few days using a semiautomated mapping workflow, a lot of satellite data and a handful of geologic reference points.
The real revelation, however, came when he compared the geologic maps he created with data about the islands’ climate and weather. Using an analytical workflow he learned in a Python for Geosciences class, Rendall was surprised to discover that dominant features in the islands’ landscape are oriented in the same direction as prevailing winds that have blown through the region in much the same way since the last ice age.
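A comparison of this kind, between the orientation of landscape features and the direction of prevailing winds, might look something like the sketch below. It is not Rendall's workflow, and the numbers are randomly generated stand-ins, not Bahamas data. The helper functions `axial_mean_deg` and `axial_difference_deg` are hypothetical names introduced here. The one real subtlety it shows is that orientations are "axial": a ridge trending 10 degrees is the same as one trending 190 degrees, so angles are doubled before averaging.

```python
import numpy as np

def axial_mean_deg(angles_deg):
    """Mean orientation of axial data (0-180 degrees) via the doubled-angle trick."""
    doubled = np.deg2rad(2.0 * np.asarray(angles_deg, dtype=float))
    mean_doubled = np.arctan2(np.sin(doubled).mean(), np.cos(doubled).mean())
    return (np.rad2deg(mean_doubled) / 2.0) % 180.0

def axial_difference_deg(a_deg, b_deg):
    """Smallest angle between two orientations, in the range 0-90 degrees."""
    diff = abs(a_deg - b_deg) % 180.0
    return min(diff, 180.0 - diff)

# Synthetic stand-in data (NOT real measurements): ridge trends scattered around
# 100 degrees, and winds blowing from around 280 degrees, i.e. along the same axis.
rng = np.random.default_rng(42)
ridge_orientations = (100.0 + 10.0 * rng.standard_normal(500)) % 180.0
wind_directions = (280.0 + 15.0 * rng.standard_normal(500)) % 360.0

ridge_mean = axial_mean_deg(ridge_orientations)
wind_axis_mean = axial_mean_deg(wind_directions % 180.0)  # fold directions onto an axis
misfit = axial_difference_deg(ridge_mean, wind_axis_mean)

print(f"ridge trend: {ridge_mean:.1f}, wind axis: {wind_axis_mean:.1f}, misfit: {misfit:.1f} degrees")
```

A small misfit between the two mean orientations is the kind of pattern that would prompt a closer look at whether wind shaped the landscape.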
“Data analytics is about turning a million rows on a spreadsheet into meaningful information and stories,” Rendall said. “You still need the field studies and traditional analysis to calibrate the programs, but when you integrate the two the applications are tremendous, things that would be impossible to do otherwise.”
Rendall said the information his project generated could potentially help find and estimate how much freshwater is stored within the Bahamas dune fields. With a little additional analysis it should be possible to predict what will happen to that water as sea level rises and wind patterns change. For an island chain with no other source of freshwater, these kinds of insights are critical for the future of Bahamian communities.
The AI Elephant in the Room
Geoscientists have examined the Earth to understand its many resources and potential hazards since the emergence of modern geology. Researchers like Trugman and Rendall have the same goals as geoscientists from earlier generations, but the techniques they use allow them to tackle climate, earthquakes and new energy sources in ways their predecessors could never have imagined.
Jackson School Professor Charlie Kerans, who supervised Rendall’s doctoral thesis, admits he is blown away by the power of machine learning and artificial intelligence.
“There’s no hiding from it,” he said, “but we can’t expect a computer scientist who doesn’t have a background in the geosciences to know what questions to ask of the data. That’s why we need to train our students to use these tools and show them how they work.”
Kerans, who is the outgoing chair of the Jackson School’s Department of Geological Sciences, has been at the center of plans to create a machine learning curriculum since he first noticed that strong student interest in the subject wasn’t being met by the department’s course offerings.
According to Kerans, money for machine learning projects was flowing onto campus, but little of it was reaching the department. Meanwhile, students were flocking to the few machine learning and data analytics courses already offered by the department, such as the Python for Geosciences class that inspired Rendall’s research.
Around the same time, a National Science Foundation-funded report on the employability of future geoscience students, led by former Jackson School Dean Sharon Mosher, concluded that data analytics and machine learning are among employers’ most sought-after skills.
Yet the majority of students were graduating without those skills.
One of the problems, according to Kerans, was that for too long the school had sold itself short by failing to impress on students that the geosciences are more than energy and dinosaurs.
“This isn’t just about training geoscientists to think differently, we’re changing what people think we do,” he said.
The final piece in the puzzle was seed funding from Chevron, whose previous support for the department had helped create a new remote-learning, virtual reality classroom and paid for student field classes.
With Chevron’s cash injection and new additions, such as Trugman, to the department’s faculty, the pieces were in place for a new kind of program that would benefit Jackson School students and help redefine what it means to be a geoscientist.
The result is the new certificate program, Geological Sciences: Machine Learning and Data Analytics for Geosciences.
About the Program
Machine Learning and Data Analytics for Geosciences is a stackable certificate program, which means it is open both to degree-seeking students and to working professionals, and it focuses on the skills and knowledge that will be useful in their careers.
The program will train around 16 students per year and is available to current UT students and working geoscientists alike. Students will be required to take 12 hours of core coursework, including a project based on solving real-world geoscience problems.
This story was originally published in Texas Geosciences.