Advancing data and knowledge management to promote equality and fairness
Since the rise of the Internet nearly 50 years ago, society has become both empowered by and dependent on technology and the information sciences. Today, scientists cannot accurately test their hypotheses without the help of computer simulation and data analysis, companies cannot market their products without understanding user preferences, and ordinary citizens cannot go about their everyday lives without relying on personal and social information management and analysis platforms. Dr. Julia Stoyanovich, Assistant Professor of Computer Science at Drexel University, studies data and knowledge management, with the goal of helping users identify relevant information, transform that information into knowledge, and assess the fairness of data collection and analysis processes. By addressing challenges that arise particularly in scientific applications and in personal and social information management, Dr. Stoyanovich helps both scientists and everyday users work with their information more effectively.
Upon graduating from college, Dr. Stoyanovich spent five years in the startup industry, as a software developer, data architect and database administrator. This experience has motivated her to work with real datasets whenever possible, and to deliver results of her research to the communities of target users, as part of open-source systems or as stand-alone prototypes. Dr. Stoyanovich enjoys collaborative research, and is working with researchers at Technion (Israel), INRIA (France) and University of Pennsylvania.
Current areas of research include:
- Making Data Fair: Access to open data and to results of data analysis should be a basic human right, as indispensable and unquestionable as access to clean water. The “big data” wave brings incredible promise of improving people’s lives, accelerating scientific discovery and innovation, and bringing about positive societal change. Yet, if not used responsibly, big data technology can deepen economic inequality, destabilize markets and reinforce systemic bias. While the potential of big data techniques is well accepted, the importance of using these techniques in a principled, balanced and fair manner is discussed less often. Improving big data fairness requires a comprehensive solution that involves three dimensions: technology, education and policy.
  - Technology: It is incorrect to assume that insights gained from data using computational processes are objective simply because they were gathered automatically. Both data collection and data analysis may reflect the biases of their designers. What technological advances will allow us to reason about the fairness of a dataset or a data analysis method?
  - Education: What is the core set of skills and competencies that citizens should possess to be considered data-literate? These skills would allow a person to critically judge data collection and analysis processes and to correctly interpret results, helping them make informed decisions about, e.g., the risk/benefit trade-offs of vaccines, or the relative importance of government investment in foreign aid vs. military spending.
  - Policy: What is the role of government bodies in ensuring fair access to data and data analysis, providing data-literacy education for the population, and regulating the disclosure of data collection and analysis methodology in consumer-facing data products?
In this project Dr. Stoyanovich and her team make coordinated advances along these dimensions, towards the goal of making data fair and available to all.
- Analyzing Preference Data: Preferences are orders among a collection of items attributed to a population of judges. Preference data comes in a variety of forms, such as ranked lists and pairwise comparisons, and is ubiquitous in a plethora of applications across different domains. Over the past decade, there has been a sharp increase in the volume of preference data and in the diversity of applications that use it. Examples of applications include rank aggregation in genomic data analysis, management of votes in elections, and recommendation systems in e-commerce. The goal of this project is to streamline the management and analysis of preference data, supporting computational and data scientists who work with preference data.
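To make the two forms of preference data mentioned above concrete, here is a minimal Python sketch. It is an illustration only, not a description of Dr. Stoyanovich's methods or software: the function names, the toy judges, and the choice of Borda count as the rank aggregation rule are all assumptions made for this example.

```python
from collections import defaultdict
from itertools import combinations


def ranking_to_pairs(ranking):
    """Expand one ranked list (best item first) into pairwise
    comparisons of the form (preferred, less_preferred)."""
    return list(combinations(ranking, 2))


def borda_aggregate(rankings):
    """Aggregate ranked lists with a Borda count (an assumed, illustrative
    rule): in a list of n items, the item at position i earns n - 1 - i points.
    Returns items sorted by total points, ties broken alphabetically."""
    scores = defaultdict(int)
    for ranking in rankings:
        n = len(ranking)
        for position, item in enumerate(ranking):
            scores[item] += n - 1 - position
    return sorted(scores, key=lambda item: (-scores[item], item))


# Hypothetical data: three judges each rank the same three items.
judges = [
    ["A", "B", "C"],
    ["B", "A", "C"],
    ["A", "C", "B"],
]

pairs = ranking_to_pairs(judges[0])
consensus = borda_aggregate(judges)
print(pairs)
print(consensus)
```

Borda count is only one of many aggregation rules; it is used here because it is short to state, not because the project is known to rely on it.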
Bio
Julia Stoyanovich was raised in Moscow, Russia and Belgrade, Serbia. Her grandfather, who was a programmer for the Soviet Space Program in the 1950s-60s and participated in the launch of Sputnik, introduced her to mathematics and algorithmic thinking at a very early age. Julia has many fond memories of her grandfather describing to her the state-of-the-art computational environments of his day, where memory was measured in cubic meters and programming languages were very low-level. Yet the computational tasks of that era, e.g., calculating the trajectory of a spaceship, rivaled modern tasks in complexity and were accomplished successfully.
After graduating from college, Julia spent five years in the startup industry, as a software developer, data architect and database administrator. This experience has motivated her to work with real datasets whenever possible, and to deliver results of her research to the communities of target users, as part of open-source systems or as stand-alone prototypes.
Dr. Stoyanovich often works with women and members of other underrepresented groups. In her experience, having female role models is a strong motivation for women to enter computer science and become successful in this predominantly male field.
Outside of her research, Dr. Stoyanovich likes to read, cook and spend time with her family. She is also slowly writing a novel about the passage of time. Julia commutes from New York City to work in Philadelphia, PA. Other than English, she speaks five languages: Russian, Serbian, German, Italian, and a bisl Yiddish.
For more information, visit www.stoyanovich.org
Awards
Google Research Award, 2012
"Identifying ranked agreement among raters"
NSF CRA Computing Innovations Fellowship, 2009-2011
"Data exploration in biological repositories"
Michelman Award, 2008
for service to the Computer Science Department, Columbia University
DuBois Undergraduate Research Scholarship, 1997
University of Massachusetts, Amherst
Patents
U.S. Patent No. 7,958,113: "Automatically and Adaptively Determining Execution Plans for Queries with Parameter Markers."
Wei Fan, Guy Lohman, Volker Markl, Nimrod Megiddo, Jun Rao, David Simmen, Julia Stoyanovich. June 7, 2011. Assignee: IBM.
U.S. Patent No. 8,073,794: "Social Behavior Analysis and Inferring Social Networks for a Recommendation System"
Sihem Amer-Yahia, Evgeniy Gabrilovich, Bo Pang, Julia Stoyanovich, Cong Yu. December 6, 2011. Assignee: Yahoo! Inc.