June 16, 2024



Pitt unveils new data science major

Pitt senior Gordon Lu chooses majors more easily than some students pick out their lunch at the dining hall. A little bit of economics, a lot of statistics, a sampling of mathematics and heaps of computer science.

But he says one major would’ve put less on his plate and allowed him to combine all of his interests — data science.

After four years of planning and fine-tuning, Pitt quietly released a list of course requirements for a new data science major over the summer, which students will be able to declare when the fall semester starts on Friday.

Lu, a quadruple major, said the move is “a really good step.”

“I know there are a lot of people in computer science, economics and statistics [who] are all really interested in data science,” Lu said.

Data science is the fourth major at Pitt to be offered jointly by the School of Computing and Information, or SCI, and the Dietrich School of Arts and Sciences. At 61 credits, the major pulls courses from four different departments — computer science, information science, math and statistics — giving students a strong foundation in programming and quantitative disciplines.

Those skills are in high demand on the job market. Forbes branded data as “the new oil” in a 2019 article. And data science jobs, which often command six-figure salaries, are projected to grow more than 30% in the next five years, according to one estimate from IBM. That’s because data is ubiquitous, said Prashant Krishnamurthy, chair of the department of informatics and networked systems.

“No matter which industry or which human activity we take up these days, we are all generating data,” Krishnamurthy said. “But data by itself may not be useful unless we are able to get some meaningful information out of it.”

The new major will offer many classes to help students wrangle cumbersome datasets and develop insights.

As a foundation, students will need to take calculus, probability and statistics, as well as a programming sequence in the computer science department. From there, students can move on to the “expertise” section, where they’ll use their base knowledge to analyze real-world data. Pitt will also require data science majors to complete a specialization — Computer Systems, Data Analytics, Data Science in Context or Modeling — along with a capstone.

The foundational courses will mostly be test-based, while later classes will allow students to complete their own projects, according to the syllabuses of the required courses.

By completing the major, students will learn several different programming languages — R for statistical modeling and computation, Python for machine learning and Java for general computer science. Kostas Pelechrinis, who had a hand in designing the major, said the goal was to help students be adaptable.

“You can imagine that when students go out in industry, maybe Python is obsolete — won’t be in two years, but it might be in 10,” Pelechrinis, a SCI professor, said. “They will need to be able to pick up these new tools.”

Incoming data science students will also need top-notch problem-solving skills, a strong math background and ideally some exposure to programming before starting college, Pelechrinis said.

“They will need to have this computational thinking — analytical thinking,” he said. “Students who declare this major need to have this interest in learning many different things.”

Despite the growing popularity of data science, Krishnamurthy said Pitt has been slow to roll out the major for two reasons: planning couldn’t begin until the University formed its School of Computing and Information in 2017, and the COVID-19 pandemic then stalled the approval process.

From start to finish, the process took about four years, according to several faculty involved in creating the major, including Krishnamurthy and Pelechrinis. In that time, dozens of other universities, including Penn State and Temple, have introduced data science majors. Compared to Temple’s program, which requires physics or chemistry and three semesters of calculus, Pitt’s major requires seven fewer classes — less math and no hard sciences.

“I think that even though it was a little later than other universities, the final product is just as good, if not better,” Pelechrinis said.

The data science major has also led the statistics department to modernize its curriculum, offering more programming classes, according to statistics department chair Satish Iyengar. In the past few years, the statistics department has introduced a two-course sequence — STAT 1261 and 1361 — teaching programming for statisticians. Several classes in the department will begin teaching R, a more modern programming language, instead of older statistical packages. The introductory statistics courses — STAT 1000 and 1100 — will still use Minitab and Excel, respectively, for now. 

Many Pitt students interested in data currently major in statistics, said Lu, who runs a data science club called Data Analytics Through Applied Statistics, or DATAs — but he doesn’t know many students who are aware of the data science major. Now that the major has been approved, several faculty said the next step will be to advertise it to the student body. 

Rachel Kelly, a spokesperson for SCI, said Pitt is currently advertising the major via the University website, student events and during advising appointments. Promotion efforts, she said, “will continue throughout the term.”

Pelechrinis said he hopes students consider the data science major, because data has become widespread in industries and everyday life.

“Being able to understand what the data tells you and what it doesn’t tell you … [being] able to use data to actually make informed decisions … It’s pretty important,” Pelechrinis said.