In October 2023, the Faculty of Religion and Theology at Vrije Universiteit Amsterdam organized two hackathon events for the Network Institute Research Visit project, titled “Live-mapping Religious Difference Online.” This project explores the interaction between digital society, religious texts online, and live-mapping.
The first event, “Hackathon Text-mining in the Age of Restrictions,” took place on October 26th and aimed to investigate digital humanities approaches to religious topics, involving theologians, anthropologists, and sociologists. Dr. Yusuf Çelik, an Assistant Professor at the Faculty of Religion and Theology, kicked off the hackathon by introducing the fundamentals of text-mining, focusing on scraping and APIs.
In keeping with this focus on APIs, he addressed the decision by X (formerly Twitter) to limit API access for academic research. X was once a vital tool for academic research, offering valuable insights into internet trends. However, under Elon Musk’s ownership, X has shifted its focus towards monetization, making it increasingly difficult for researchers to access crucial data. Prior to Musk’s takeover, X’s API was highly regarded, enabling studies on topics such as responses to weather disasters and misinformation prevention. X ended free access to its API in February 2023 and introduced paid tiers in March. This has left scientists without a reliable alternative for studying human behavior and poses a significant challenge to ongoing research. Unless X reverses its current trajectory, this could end an era of academic research using the platform.
Dr. Çelik encouraged participants to form teams and address specific problems: how to deal with rate limits when scraping data from the platform, how to optimize use of the platform’s API given its costs and limitations, how to infer geolocation information when it is not specified in metadata, and which other social media outlets could be used for research. Following the morning session, all participants were divided into groups to find technical solutions to these problems, with the aim of promoting digital humanities and supporting future scientific research and technological innovation.
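The rate-limit problem named above can be illustrated with a minimal Python sketch. It is not a solution produced at the hackathon; it only shows one common way to stay within a limit, namely retrying with exponential backoff when a server answers with HTTP 429 (Too Many Requests). The function name `fetch_with_backoff`, the `fetch` callable, and the `(status, retry_after, payload)` response shape are all hypothetical and chosen for illustration; a real client would wrap an actual HTTP request.

```python
import time
from typing import Callable, Optional, Tuple

# Hypothetical response shape, assumed for this sketch:
# (HTTP status code, Retry-After hint in seconds or None, decoded payload or None)
Response = Tuple[int, Optional[float], Optional[dict]]


def fetch_with_backoff(
    fetch: Callable[[], Response],
    max_retries: int = 5,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[dict]:
    """Call `fetch` until it succeeds, waiting longer after each HTTP 429.

    Returns the payload on success, or None if the rate limit is still
    being hit after `max_retries` retries.
    """
    for attempt in range(max_retries + 1):
        status, retry_after, payload = fetch()
        if status == 200:
            return payload
        if status != 429:
            # Anything other than "too many requests" is not handled here.
            raise RuntimeError(f"unexpected status {status}")
        if attempt == max_retries:
            break
        # Prefer the server's own hint; otherwise double the wait each time.
        delay = retry_after if retry_after is not None else base_delay * (2 ** attempt)
        sleep(delay)
    return None
```

Passing `sleep` as a parameter keeps the sketch testable without real waiting. Honoring a server-provided hint before falling back to a doubling delay is a design choice that favors staying within the platform’s stated limits over maximizing throughput.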
During the morning session, PhD candidate Fan Li from the Eindhoven University of Technology raised significant concerns about the ethics of text-mining, with a particular focus on data collection and social media scraping. She examined the role of ethics in technical innovation research and its potential influence on the design process of technological innovations and on the work of engineers and data scientists. During the hackathon, she worked alongside Clyde Missier and Dr. Srecko Koralija from Vrije Universiteit Amsterdam to examine the ethical considerations associated with text-mining in the context of social media. Two pivotal questions frequently arise concerning data usage in social media: “Who owns the data?” and “How is the data used?” These questions serve as the foundation for exploring ethical issues such as social media users’ ownership of their data and the manipulative actions of companies like X, which restrict API access to data that users themselves contribute.
One of the most prominent concerns related to social media data is data ownership and the ongoing debate over whether such data should be regarded as public or private. Central to this debate is the fact that social media users generally agree to a set of terms and conditions when using a platform. These terms and conditions often contain clauses detailing how one’s data may be accessed by third parties, including researchers. However, complications arise when X limits access to the data obtained from the platform and treats the data as its exclusive property. This shift in data ownership from individuals to social media companies is not primarily aimed at regulating the market or technological advancements. Instead, it is often employed to manipulate the development and application of technology.
Social media platforms with business models akin to those of X and Facebook, which involve shareholders and dividends, can be held accountable for ethical considerations, as they are viewed as equal participants in the data-sharing ecosystem. Universities do not share this status. Academic institutions are primarily knowledge-producing organizations whose goal is to generate knowledge for the benefit of society. Their objectives are not profit-driven, and unlike commercial companies, they do not operate to maximize profits for shareholders.
The fundamental way for the academic community to address the ethical challenges related to data ownership and manipulation by social media companies is to use the data to enhance societal knowledge. In line with this approach, there should be a dedicated solution for universities, as their primary intent is to use the data for non-commercial purposes. We can consider creating an “AI and Data Analytics for Strategic Alliances” platform, serving as an academic nexus for texts, images, and videos. This platform would be an open and collaborative space, allowing researchers to access and use the data responsibly. Existing models, such as Nexis Uni, LexisNexis Legal & Professional, and products like Message from Talkwalker, which assess the impact of data across multiple languages and countries, can serve as sources of inspiration for the creation of such an “academic Nexis Uni.” This academic Nexis Uni would facilitate responsible data usage and research while preserving ethical standards and the non-commercial nature of academic research.
Ethical considerations in text-mining social media provide a pathway to foster responsible development and innovation. Embracing ethical guidelines builds trust among users, encouraging increased data sharing and participation in research. Transparency in data usage helps alleviate privacy concerns and promotes accountability. Adherence to ethical principles can lead to the establishment of best practices and industry standards, simplifying processes for researchers and organizations. Beyond this, ethical text-mining facilitates collaboration with social media platforms, universities, and regulatory bodies, fostering a deeper understanding of ethical complexities and the development of shared solutions. Ultimately, ethical text-mining not only safeguards user interests but also unlocks opportunities for responsible advancement and innovation in this field.