Applied Data Science – Machine Learning on Graphs: Link Prediction in Social Networks
Rita's master's thesis and the resulting publication focus on know-how development for the company Beekeeper, together with one of the largest airports in Europe. The research centers on optimizing communication between employees and management as well as operational processes. How accurate is machine learning at extracting and predicting employee relationships? Read on to find out.
A Case of Know-How Development at Beekeeper
Rita Koranyi – Applied Data Science student HSLU
Paper/Publication: GDPR-Compliant Social Network Link Prediction in a Graph DBMS – The Case of Know-How Development at Beekeeper
Poster: Applied Machine Learning Days 2022
Can you first tell us a bit about your background?
After nine years with a software company, I decided to make some changes and thus signed up for the Master of Science in Applied Information and Data Science in February 2020. After doing many lessons at home, I started on my thesis, together with the company Beekeeper. After all, I really wanted to do a project with graph databases and graph machine learning.
Now regarding your project: tell us about it
My master’s thesis title is Machine Learning on Graphs: Link Prediction in Social Networks.
In this study, the analysis of user interaction data in a graph database was compared with graph machine learning algorithms for extracting and predicting network patterns among users.
The project was carried out together with Beekeeper. Beekeeper is an enterprise social network that empowers companies to close communication gaps and improve collaboration between frontline workers and management. The tool is built not only for communication but also for optimizing company operations such as shifts, internal documents and internal communication management in general. They have been among the pioneers in the field of enterprise social networks as a service, and they have important key customers in their portfolio. Beekeeper provided me with an anonymized dataset. We wanted to learn whether this data, in the form of graphs, could be used for machine learning.
My research focused on optimizing their communications and operations, and the project concentrated on one of their largest key customers in the transportation industry in Europe. Specifically, the use case was to find out how accurate machine learning is when it comes to extracting and predicting employee relationships. A lot of information gets lost through anonymization, but data privacy is a very important aspect of managing and protecting personal data.
The digital world contains massive amounts of information, far more than people can consume. Beekeeper AG provides a GDPR-compliant (General Data Protection Regulation) platform for frontline employees, who typically do not have permanent access to digital information. Finding the information they need to do their job within this platform requires efficient filtering principles that reduce the time spent searching and thus save work hours. However, under GDPR it is not always possible to observe user identities and content. Therefore, this paper proposes link prediction in a graph structure as an alternative way to surface relevant information from GDPR-compliant data.
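To make the idea of link prediction a little more concrete, here is a minimal sketch in Python. It is not the method used in the thesis or the paper. It assumes a tiny, made-up set of anonymized interactions and uses the Jaccard coefficient of shared neighbors, a common baseline heuristic, to score pairs of users who are not yet connected. All identifiers (`interactions`, `build_adjacency`, `jaccard`, `rank_missing_links`) and the data are hypothetical.

```python
from itertools import combinations

# Hypothetical, anonymized interactions: each pair means two users interacted.
interactions = [
    ("u1", "u2"), ("u1", "u3"), ("u2", "u3"),
    ("u3", "u4"), ("u4", "u5"),
]


def build_adjacency(edges):
    """Turn an undirected edge list into a dict of neighbor sets."""
    adjacency = {}
    for a, b in edges:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)
    return adjacency


def jaccard(adjacency, a, b):
    """Shared neighbors divided by all neighbors of the two users."""
    union = adjacency[a] | adjacency[b]
    if not union:
        return 0.0
    return len(adjacency[a] & adjacency[b]) / len(union)


def rank_missing_links(edges):
    """Score every pair that is not yet connected, highest score first."""
    adjacency = build_adjacency(edges)
    candidates = []
    for a, b in combinations(sorted(adjacency), 2):
        if b not in adjacency[a]:
            candidates.append((jaccard(adjacency, a, b), a, b))
    # Sort by descending score, then by user ids for a stable order.
    return sorted(candidates, key=lambda c: (-c[0], c[1], c[2]))


ranking = rank_missing_links(interactions)
for score, a, b in ranking:
    print(f"{a} - {b}: {score:.2f}")
```

Pairs with a higher score share a larger fraction of their contacts and are therefore more plausible candidates for a future connection. Note that the sketch only needs anonymized user ids and who interacted with whom, not names or message content.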
Results & Findings
The results showed that, although the accuracy of the models was below expectations, the know-how developed during the process could generate valuable technical and business insights for Beekeeper AG.
What data did you use, what methods did you apply, and what insights have you gained or hope to gain?
Beekeeper provided me with an anonymized dataset because we wanted to learn more about whether this data, in the form of graphs, could be used for machine learning. Specifically, the use case was to find out how accurate machine learning is when it comes to extracting and predicting employee relationships. A lot of information gets lost through anonymization, but data privacy is a very important aspect of managing data.
How can your findings be used to help society?
When we validated the results, we could clearly see, among other things, that chat requests tend to stress and distract people at work. I think such insights can add value for a company when designing its workplaces and optimizing its processes. It could be exciting to use the findings for designing the physical spaces as well as the virtual settings in which employees communicate. It’s important to put the emphasis on the “how” rather than on the “what.”😉
Article Publication and Poster
I had lots of fun working on this project. That was also a very important factor in the decision my supervisors (Jose Mancera and Prof. Dr. Michael Kaufmann) and I made to submit an article, based on my MA thesis, to a journal of the Multidisciplinary Digital Publishing Institute (MDPI). MDPI is a pioneer in scholarly open access publishing. I had never thought about writing an article for publication before, and I remember being very excited about it and about the “Let’s do it!” in the last sentence of the kick-off call.
I felt so motivated that I also signed up for a poster presentation at the Applied Machine Learning Days 2022 (AMLD). I was so happy when I got an email telling me that my topic had been selected! #hooray
How are you looking to develop your project further?
I recently spoke with Jose Mancera about what had happened to my project at Beekeeper. My research was a pioneering topic at the company. I hope that other students will continue it as a thesis or project work at Beekeeper and improve my models or add different perspectives. I believe graph machine learning and graph databases will get more and more attention in the industry, so I will keep myself up to date on this topic. 😊
How did your studies (MSc in Applied Information and Data Science) influence the project?
I think, quite frankly, that none of this would have been possible without the programme. Otherwise, I would not have had the personal contacts and would not have been able to publish the article. The Head of Programme Prof. Dr. Andreas Brandenberg also supported me at the Applied Machine Learning Days conference, which took place after I graduated.
What advice would you give others who are starting a project?
I think enthusiasm for the topic is the most important thing. You can learn anything … 😊
And finally: What new hashtag are you aiming for in 2023?
We would like to thank Rita Korányi and her supervisors for their dedication and time to share with us this wonderful Research Project Portrait.