Videos from 2019 DataEdge Now Available
The Berkeley Center for New Media was thrilled to co-sponsor DataEdge, a School of Information conversation about the challenges and opportunities created by the rise of big data.
If you missed out, you can now view the videos from this fantastic program here and below!
The Platform Challenge: Balancing Safety, Privacy, and Freedom
Data Scientist Vs. Data Engineer: What's the Difference?
In this talk, we will demystify the perceived interchangeability of Data Scientists and Data Engineers. The two roles are distinct, and both are critical to the success of any Big Data project. However, because the two fields share a limited set of skills, organizations and companies at times assign Data Scientists and Data Engineers the same tasks. This practice can add significant risk to any project. We will go over a few examples of how Data Scientists and Data Engineers work together to build a product.
Creating an Analytics Operating Model for Success
Regardless of whether you want to work for a start-up, a large company, or a non-profit, your employer’s “analytics operating model” will be a critical part of both the expectations placed on you and your ability to succeed. Are data scientists expected to develop their own pipelines? Do data engineers sit with line-of-business project teams, or with IT peers? Are there documented developer best practices, and how strictly are they enforced? Attendees will have the opportunity to develop an understanding of common analytics operating models and of how their personal skills and working style may match or clash with such models.
Data for Good: What Could Be Better?
As the data field matures, there are increasing opportunities for data scientists to look beyond the typical places where data roles proliferate, like tech startups, financial firms, or health care companies, towards problems with a clear social focus. But are these areas really all that different? This talk will explore the false distinction between those working in "corporate data science" and those doing "data for good" and describe how all data practitioners can engage with the social challenges built into their work. In particular, we'll focus on the importance of power dynamics, co-creation, and mindful decision-making in navigating the social impacts of data science projects. We'll also highlight some of the unique challenges faced by data scientists working in the social sector and cover some tips for anyone considering a shift towards the social side of the data spectrum.
Understanding Your Machine Learning Model: Black Box No More
Machine learning techniques often have a bad reputation as black box methods. This session busts that myth by demonstrating how interpretability tools can give more confidence in a machine learning model and also help to improve the insights it generates. This talk will cover best practices for using techniques such as feature importance, partial dependence, and explanation approaches. Along the way, we will consider different issues that may affect model interpretation and performance.
How to Scale AI-Led Analytics
Fraud Detection at Wells Fargo
In this talk, we will discuss the process of building a sophisticated fraud model at a large, highly regulated financial services company. We will also touch on the approach we used in building a real-time model in a big data environment, deploying that model to production, and the work involved in maintaining and monitoring model performance over time.
The Perils of Data Journalism
Ethics and standards are critical considerations in the fields of journalism and data science, and certainly in the emerging discipline of data journalism. In this talk, panelists discuss the importance of these concerns and the obstacles and pitfalls that can mask or undermine truth and clarity in data-based reporting.