During the past decade, campuses have begun leveraging large sets of data to uncover previously unseen trends and drive strategic decision making. From measuring financial health, to optimizing enrollment and aid packages, to supporting student learning and advising, data analytics has the potential to help an institution meet its mission and improve student success.
For instance, the capacity to leverage diverse data sets can provide opportunities for campuses to better serve students by learning about their interests and behaviors and by encouraging them to make successful choices.
Consider the University of Arizona, Tucson, which used identification card swipes to track student location and time-stamp data to gain insight into how students were spending their out-of-class time. By tracking card use for vending machine purchases, library interactions, and residence hall access (among nearly 1,000 other data points on campus) and combining these data with demographic and performance data, the institution could predict the likelihood that a particular student would drop out. That likelihood might be higher, for instance, if a student was accessing his or her room late at night or spending limited time in study sessions at the library. These predictions were then given to advisers, who were able to intervene and work with students to alter their routines and establish plans for success. By using analytics, in concert with adviser interventions, the university increased student retention rates by almost 3 percent.
Data such as these can be used for more than just supporting student success. A hallmark of analytics technologies is the ability to recombine and repurpose data for new priorities and analyses. For instance, when recombined with facilities-based data sets, student location data can also help an organization understand space and resource use and needs on campus. The flexibility of analytics to generate metrics on various institutional priorities and to inform decisions is among its strengths.
Yet, inherent in the unique nature of data analytics is the potential for users to violate privacy and ethics rules. With vast amounts of data being produced by students and being used by colleges and universities to meet institutional goals, it is important to consider the parameters of that data production and use. Given higher education’s legal duty of care to students, business officers and other leaders on campus must pay attention to, and understand, the ethics of using student-generated data, which is central to unbiased and accurate decision making.
Beware of Black Boxes
Analytics systems are often referred to as “black box technologies” because they are typically proprietary in nature. Often, vendors offer little transparency about what is collected, analyzed, and used. As a result, data users are in the dark about which variables vendors use to make predictions about which outcomes. This lack of transparency creates an opening for vendors to create biased and decontextualized algorithms—the formulas used to make predictions—which can hamper the ability of organizations to make accurately informed decisions.
Scholars are working to shed light on the potential for discrimination and bias to be baked into the design of algorithm-driven systems such as analytics technologies. These include Safiya Noble, author of Algorithms of Oppression: How Search Engines Reinforce Racism (NYU Press, 2018), and data scientist Cathy O’Neil, author of Weapons of Math Destruction: How Big Data Increases Inequality and Threatens Democracy (Crown, 2016).
Algorithmic discrimination and bias can appear in higher education when, in an effort to provide intervention with resources and support, algorithms are used to identify students who may be at risk of failing. Many algorithms are constructed from performance and demographic variables to predict an outcome. For instance, a particular algorithm or predictive model might tell you who your low-income students are and then flag all low-income students for an intervention. Because a student’s demographics (race, gender, class, etc.) are immutable, it is unethical for campus leaders to correlate demographics with performance.
The capacity of algorithmic correlations and predictions to provide a depth of understanding to help solve problems related to inequities is limited. Furthermore, they can actually reflect and magnify patterns of discrimination. For instance, based on their characteristics, course load, and GPA, students could be dissuaded from taking a certain course or majoring in a field of interest because data analytics tools suggest, in advance, that they may not be successful. As a result of biased algorithms, students who are less prepared for college, who are first-generation college students, or who come from less affluent backgrounds could be more likely to be tracked onto specific pathways for completion.
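One simple way to spot this kind of pattern is to compare how often a model flags students in different groups. The sketch below is illustrative only: the records are invented toy data, and the field names (`income_band`, `flagged`) and the function `flag_rates_by_group` are hypothetical, not part of any vendor's product.

```python
from collections import defaultdict


def flag_rates_by_group(records, group_key, flag_key="flagged"):
    """Return the share of flagged students within each group."""
    totals = defaultdict(int)
    flagged = defaultdict(int)
    for record in records:
        group = record[group_key]
        totals[group] += 1
        if record[flag_key]:
            flagged[group] += 1
    return {group: flagged[group] / totals[group] for group in totals}


# Invented toy records for illustration only (not real student data).
students = [
    {"income_band": "low", "flagged": True},
    {"income_band": "low", "flagged": True},
    {"income_band": "low", "flagged": True},
    {"income_band": "low", "flagged": False},
    {"income_band": "other", "flagged": True},
    {"income_band": "other", "flagged": False},
    {"income_band": "other", "flagged": False},
    {"income_band": "other", "flagged": False},
]

rates = flag_rates_by_group(students, "income_band")
for group, rate in sorted(rates.items()):
    print(f"{group}: {rate:.0%} flagged")
```

A large gap between groups does not prove that a model is biased, but it is a signal that campus leaders should ask which variables are driving the flags before acting on them.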
While algorithms can be powerful tools, it is important to remember that they do not always take into account the interests, desires, and goals of students. Nor do they take into account the structures that might exist as deterrents to student success—variables that actually may be within an institution’s control to change.
Understanding institutional and student contexts is vital when interpreting data, yet it is often impossible given the lack of transparency of analytics tools. Algorithms that might work within one context do not always translate to another. For instance, algorithms developed and trained at small liberal arts colleges may not scale effectively for use at a large research institution or within a community college environment, where the student and institutional contexts are decidedly different.
The harm of black box systems is in their potential to inform data-driven decision making that is uninformed by the specific priorities and goals of an institution or the needs and interests of its students. Despite a lack of evidence, analytics users often assume that algorithmic decisions have greater accuracy, precision, and consistency than the decisions of their human counterparts. But black box algorithms actually elevate the importance of the correlation or prediction above data analysis because the details of the data are hidden. Assuming that using data analytics and black box algorithms is a superior way to make decisions is dangerous because it obscures and decontextualizes the nuances present in data and ultimately sidesteps the students who produced these data.
Codes of Practice
The good news is that business officers can take the lead on developing collaborative codes of practice—the guiding mechanisms for appropriate data use—and data governance policies that recognize the vested interest of students in how, where, and by whom their data are used. Ideally, business officers will do so through the lens of “data justice” and “data care.” These terms are evolving, but they generally speak to the need for an equitable approach to data use that puts the needs, interests, and contexts of those producing data (for example, students) at the forefront of analytics design, analysis, and decision making.
Given the nature of analytics and the consequences of the black box technologies that deliver these data—not only to inform institutional priorities, but also to shape student outcomes—it is necessary for higher education administrators to approach the use of these data thoughtfully. Unfortunately, there are few existing, and no universally accepted, guidelines or policies for practice. Current policies or guidelines are often limited to explanations of federal privacy policies or articulation agreements centered on data security and access.
In an effort to provide better guidance and address ethical issues, scholars have developed a number of codes of practice that focus on promoting data transparency, security, ownership, control, stewardship, and trust. Although these codes of practice provide useful guidance, they are still evolving and fall short of comprehensively addressing the contexts and needs of organizations and students from an ethical perspective.
To address the bias and potential for discrimination in data analytics, Linnet Taylor—a data analytics researcher at Tilburg University in the Netherlands—has developed a data justice framework based on three pillars:
1. Visibility includes access to information, representation, and privacy, and focuses on how individuals are represented, profiled, and monitored through analytics systems.
2. Engagement with technology includes autonomy in making technology-related choices (including the choice not to use or be used by technologies) and sharing the benefits of data collection and use.
3. Nondiscrimination includes the ability to challenge bias within, and to be free from discrimination in, big data algorithms.
While this framework for data justice provides a useful foundation for establishing ethical use of data analytics, it does not speak to the unique nature of higher education. As noted, higher education institutions have a legal duty of care. International professors and lecturers Paul Prinsloo and Sharon Slade have been pioneering research on the ethical use of analytics in higher education. They argue that care should be extended to the collection, storage, and use of student data. They also contend that care is required because current policies and data justice initiatives alone do not take into account the specific contexts within higher education or the complexities of individual students. Care-based use of analytics should incorporate codes of practice and pillars of justice from a student-centered perspective.
Guidelines for Ethical Use
Data justice, care, and associated codes of practice combine to form a touchstone for grounding policies, processes, and practices of ethical analytics-informed decision making. From enrollment planning, to benchmarking, to strategic initiatives, to student support, codes of practice that incorporate the principles of data justice and care can be used as a framework for centering students in data analytics processes, unearthing the contextual complexities that influence data-informed decision making, and providing a useful starting point for conversations about ways to improve data-informed processes.
We recommend the following guidelines for approaching data analytics from a more ethical and equitable perspective:
Consider context. Data are invaluable for facilitating efficient and effective solutions to many challenges campuses face. However, campuses are as varied as their students. Each campus has its own mission and priorities, and students bring their own myriad experiences and goals. These contexts matter—and data are only as valuable as the meanings campus leaders derive from them.
Campus audits that are focused on understanding an institution’s readiness, capacity, and unique culture are helpful ways to better align analytics with an institution’s needs and priorities. Analytics and institutional alignment are especially important when working with predictive or prescriptive data, as algorithms developed in one context will not easily translate to the unique context and needs of another environment. To help contextualize data, higher education should approach the purchase, use, and implementation of analytics tools with the mission and priorities of the institution and the specific needs, goals, and experiences of students top of mind.
Collaborate broadly. Improved collaboration across higher education divisions and departments can help focus attention on student needs and interests. At the University of Georgia, Athens, information technology and institutional research teams are collaborating in an effort to understand data analytics from multiple perspectives, and other institutions have similarly combined their business intelligence and institutional research teams. By pooling expertise from a variety of campus stakeholders, colleges and universities can derive a more complex and nuanced understanding of analytics and implement more holistic interventions and responses to those data. Such an approach has great potential in higher education and encourages institutions to go even further in engaging stakeholders across campus in analytics initiatives.
An ethical approach based on data justice and data care would extend these collaborations beyond the better integration of various offices to include faculty, adviser, and student users of these tools—campus members who are often absent from analytics development and decision making. Various studies have indicated that data analytics and tools are more likely to be used inclusively when a broad group of stakeholders (via formal councils or meetings) is involved in their development and implementation. More importantly, collaborations that include data producers in addition to data users are more likely to generate data interpretations of greater relevance to the specific contexts, priorities, and goals of their organization.
Inclusive data governance models are not a new idea, but they remain an underutilized strategy. Rio Salado College, Tempe, Ariz.; University of Michigan, Ann Arbor; and Georgia Gwinnett College, Lawrenceville, all have good higher education models for leveraging the perspectives of faculty, advisers, and students in data policy development.
Push for transparency and open source systems. When purchasing proprietary, vendor-based analytics systems, it is essential to understand which variables make up analytics tool algorithms, how those algorithms are constructed, and how resulting data interventions are determined. Higher education institutions can use their considerable purchasing power to push for greater transparency from outside analytics vendors.
In addition to advocating for transparent data from vendors, colleges and universities should consider adopting and adapting open source systems, which give institutions rights and full access for researching, modifying, and sharing system data. Transparent and open systems result in a better opportunity to understand institutional and student measures and to ensure ethical use of data.
Furthermore, institutions must be transparent about how data are collected, stored, used, and shared. Through easily accessible policies and communications, students should be able to understand what the institution knows about them and how it uses that knowledge. To improve equity, students should also have a level of control over providing consent for data use, management, and sharing, and the ability to opt out of analytics-based systems.
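In practice, honoring an opt-out can be as simple as excluding a student's records before any analysis runs. The following is a minimal sketch under stated assumptions: the `StudentRecord` class, its `analytics_consent` field, and the `consented_only` function are hypothetical names, and the records are invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class StudentRecord:
    student_id: str
    analytics_consent: bool  # hypothetical field: True only if the student opted in
    library_visits: int


def consented_only(records):
    """Keep only records from students who consented to analytics use."""
    return [record for record in records if record.analytics_consent]


# Invented toy records for illustration only (not real student data).
records = [
    StudentRecord("s1", True, 12),
    StudentRecord("s2", False, 3),
    StudentRecord("s3", True, 7),
]

eligible = consented_only(records)
print([record.student_id for record in eligible])
```

Applying the filter at the point where data enter an analytics system, rather than at the reporting stage, keeps data from students who opted out from reaching a model at all.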
Refine policies to address access, security, and privacy issues. As analytics technologies pull vast amounts of data from a variety of sources and centralize those data into a single system, proper authorization and verification policies and processes, at a minimum, are essential for interaction with these systems. Equally important is a commitment to establishing the human and capital resources necessary to guard these data against breaches, to ensure student privacy, and to establish rules for accessing and using data analytics.
Beyond simply purchasing and deploying analytics tools, colleges and universities must work with vendors and their own information technologists to ensure data security and clarity regarding data-use processes. This clarity must extend to student access and consent. Arguably, students—being producers of the data that are used to improve institutional outcomes, in addition to their own individual outcomes—should have full access to their data and collaborate in the data consent and collection process. Moreover, students should have the right to consent to how their data will be used and for what purposes.
Address data inequities. Ethical and equitable use of data analytics requires that users address the structural, organizational, and individual inequities that exist in higher education, which can be exacerbated through the use of analytics and their algorithms. Auditing current policies and procedures and creating new ones where needed, along with developing communications related to ethical and equitable analytics use, is a good place to start.
Another good starting point is to provide training and development for all campus members regarding equitable analytics use, as well as implicit bias within postsecondary data and institutions. Furthermore, students, and the faculty and advisers interacting with them, should understand the predictive and prescriptive nature of many analytics technologies and how to use those data. They should also have a path for questioning, disputing, or remediating predictive or prescriptive interventions or privacy violations.
An Absolute Necessity
Data analytics offers great potential to improve organizational outcomes for higher education. Through real-time, visualized, and diverse measures, a more complex and clearer view of institutional efficiencies can be realized. However, the collection and use of these data must exist within an ethical framework that leverages strategies for data justice, care, and codes of practice.
Ethical analytics policies and practices can help mitigate potential privacy and ethics violations and can provide the context and data-informed decision making needed to create a more holistic picture of higher education’s future. Ethical use of data analytics is not only the right thing to do; it also reminds us of why we do the work that we do. As members of a campus community, business officers and campus leaders are tasked with creating an institution of learning that helps students, who are at the heart of our work, to succeed.
CARRIE KLEIN is a Ph.D. candidate and research assistant in the higher education program, George Mason University, Fairfax, Va. MICHAEL BROWN is an assistant professor in higher education and student affairs in the school of education, college of human sciences, Iowa State University, Ames.