Data is a critical building block of a fast-approaching future, and Rensselaer is ensuring that all of its students are adept architects with the adoption of a new institutewide requirement in data education. The requirement, the first of its kind in the nation, will propel all Rensselaer students beyond the current collegiate standard of “data literacy” to “data dexterity”: proficiency in using diverse datasets to define and solve complex real-world problems.

The data requirement is part of an updated core curriculum (the common academic and non-academic elements that all Rensselaer students must complete to graduate) that reflects the skills and capabilities graduates need to be tomorrow’s global leaders and problem-solvers. Data dexterity is a critical piece of that equation. A 2017 market analysis from the Business Higher Education Forum projects that annual job openings for data science and analytics roles will rise steadily to 2.72 million postings in 2020.

“Our future is increasingly data-driven, and successful leaders must be able to harness the power of data in solving the problems they tackle,” said President Shirley Ann Jackson. “Key elements of that data dexterity are an ability to leverage data to make decisions, to understand the difference between causation and correlation, an understanding of ethical use of data, and how to visualize complex data in ways that emphasize key mechanisms. All Rensselaer graduates will have those skills.”

A “data-intensive sequence,” as approved by the Faculty Senate, requires all students at Rensselaer to complete two “data-intensive” courses: one to establish the foundations of data modeling and analysis, and a second within their academic discipline. Rather than increasing the overall number of credit hours students must earn, Rensselaer will infuse curriculum on data awareness and exposure into existing courses that will be designated as “data-intensive.” Building on concepts introduced as part of a National Science Foundation (NSF)-funded pilot program, Rensselaer has also developed new courses and opportunities for students who wish to explore data-driven study, such as a data-centric laboratory experience that connects teams of students with industry partners to tackle a data-intensive problem.

The new core curriculum, spearheaded by the Faculty Senate Curriculum Committee, ensures that Rensselaer graduates are creative and critical thinkers who can marry disciplinary expertise with interdisciplinary collaboration. In addition to the “data-intensive” requirement, the new core includes a new requirement for co-curricular activities such as capstone projects within the disciplinary majors, and a Rensselaer enrichment core, including a common reading experience, an away experience, and co-curricular academic and cultural activities, as well as revisions to existing “communication-intensive” requirements and flexible science, humanities, and social sciences course requirements.

“The design of the new core curriculum was a faculty-driven process, with members of all five schools participating. It represents a new vision of the foundation of the Rensselaer education, emphasizing the importance of co-curricular activities as well as curricular experiences,” said Lee Ligon, chair of the Core Curriculum Committee, associate professor of biological sciences, and an associate dean in the School of Science.

The data requirement acknowledges that data is increasingly pervasive in tracking aspects of life, from massive systems like global climate to the most intimate realms of personal health and fitness. At Rensselaer, for example, students have used data to quantify the impact of human activity and climate change on aquatic ecosystems, work that makes it possible to develop concrete mitigation approaches. As a digital reflection of our physical reality that can be manipulated, analyzed, and queried, massive datasets offer a tantalizing potential to yield insights that would otherwise be out of reach.

The new requirement exposes Rensselaer students to data analysis through coursework in relevant elements of machine learning, linear algebra, optimization, and modern approaches to statistics. Students also will be exposed to issues of ethics and policies related to data’s use and misuse.

“At Rensselaer, we believe that the ability to manage and exploit data is a fundamental skill that all of our graduates should possess,” said Prabhat Hajela, Rensselaer provost. “We educate our students to be leaders in whatever endeavor they seek to pursue, and dexterity in handling large and diverse datasets will be a critical component in their success.”

The three-year NSF-funded pilot program, “Data Analytics Through-Out Undergraduate Mathematics Program,” or DATUM, developed coursework to introduce big data techniques throughout the curriculum. As part of DATUM, Rensselaer faculty developed a data mathematics course for early math students, offered a summer research experience to students who completed the course, and created guidelines for incorporating data analysis and modeling techniques in courses across the curriculum and for providing a capstone experience.

Two elements of DATUM, the Introduction to Data Mathematics (IDM) course and a companion data research capstone experience in the Data INCITE (Data Interdisciplinary Challenge Intelligent Technology Exploration) Laboratory, are now regularly offered courses.

The IDM class introduces early undergraduate students to concepts and techniques, such as high-dimensional mathematical models, that are typically not fully discussed until senior year, said Kristin Bennett, a Rensselaer math professor and DATUM principal investigator.

“We asked the question, ‘what do you really need to know to do data analysis?’ And we built a class that introduces the mathematics that you really need—elements of linear algebra, optimization, multi-variable calculus, statistics—in the context of data analysis,” Bennett said. “Ordinarily, students would learn these tools bit by bit throughout their undergraduate math education, but our class brings it all together and makes it possible for them to tackle data analysis right out of calculus.”

Students who completed IDM were eligible for a summer research program where they tackled research questions such as using data to analyze advanced manufacturing techniques, understanding the biomedical aspects of circadian rhythms, and analyzing the ecological implications of chemical changes in Lake George. In an NSF evaluation of the pilot, students who completed IDM and the summer research reported that the experience strengthened their professional potential by developing marketable skills and revealing the opportunity to pursue a broad variety of fields, from engineering to the arts, through data. The success of the program provided evidence to support the faculty in developing the new requirement.

An independent review of the NSF-funded DATUM pilot project reported that “findings suggest that students who participated in IDM and summer research were profoundly shaped by their experience. Many students’ professional pathways were ‘reshaped,’ and they now saw a wide range of professional opportunity when they combined their current interests with data analytics.”

“I’m really excited to see this new program put into place,” said Jim Hendler, director of the Rensselaer Institute for Data Exploration and Applications (IDEA). “Graduate students and faculty throughout our campus have increasingly been using advanced data analytics and machine learning in their research. With this new requirement, all of our undergraduates will be exposed to these exciting new technologies.”