Co-founder of fast.ai and named by Forbes magazine as one of ‘20 Incredible Women in AI’, Rachel Thomas talks data ethics, bias in machine learning and a lifetime of tearing down barriers

 

Rachel Thomas is an American computer scientist and founding director of the Center for Applied Data Ethics at the University of San Francisco. Together with Jeremy Howard, she is co-founder of fast.ai, which created the ‘Practical Deep Learning for Coders’ course that over 200,000 students have taken, and which focuses on making deep learning accessible to students from diverse backgrounds who have small datasets and limited computational power. Rachel earned her math PhD at Duke and was an early engineer at Uber. A former software engineer, she is a popular writer and keynote speaker on data ethics, AI accessibility, and bias in machine learning.

 

What initially sparked your interest in data ethics?

I’ve been interested in issues concerning justice and society since I was a teenager. I attended a racially diverse high school and could see and feel the disparity between my school and wealthier public schools that were less diverse and had more resources.

Concern about inequality was part of my life through my volunteer work and activism, but for a long time it felt separate from my schoolwork and professional goals, since I was studying theoretical math. I didn’t focus on data ethics until I began working as a data scientist and software engineer – roles that in recent years have had a huge societal impact.

Did you ever consider a different career?

As a child I wanted to be an archaeologist, marine biologist, or a writer. As I got older, my goal was to become a math professor. I loved math and had a great time in undergrad, but when I did a PhD in mathematics, I was turned off by all the sexism and toxicity I saw in academia, so I went into the tech industry instead. There I experienced the same kind of toxic working environment, and that was when I moved towards working on issues of data ethics, inclusion and diversifying AI.

Are you frustrated by the inequality and sexism issues in AI? What’s the best way to address them?

It angers and concerns me. We saw it quite recently when Google fired Dr. Timnit Gebru. Seeing how a Black female scientist and a global AI expert was treated shows how hostile the field of AI can be to people who are not part of the majority.

We also know there are problems in products, like software that is biased against darker-skinned individuals when predicting who will commit crimes. I’ve also written about how racial bias affects medicine and how certain patients are deprived of particular diagnoses and treatments.

There’s no simple answer but there are two key things we should be thinking about – participants and power. Who are we developing these products for? Medicine is all about patients and physicians so we need to involve both groups when developing an AI solution.

We also need to think about how these AI systems shift power. Do they give people a way to say when these systems are getting things wrong? If something goes wrong, how quickly can we surface errors and make corrections?

Do you think teaching people to know more about AI and machine learning will reduce issues of inequality, bias and racism?

It’s part of the process, but it’s not sufficient on its own. There needs to be a more active effort to address these issues. A study found that 40% of women working in the tech industry leave, compared to just 17% of men. Too often, tech companies and research labs end up looking like revolving doors.

So it’s not just about getting more women and people of color into the industry but ensuring they don’t leave after being mistreated and disempowered at work. I also think there are broader issues, like how powerful these tech companies are. For example, they create algorithmic systems that can make life-impacting decisions, such as whether someone will be hired by a company or be eligible for a loan.

We definitely need a greater literacy of AI in our society so that people can recognize what’s wrong with some of these AI algorithms and differentiate them from the real, amazing advances that AI can bring.

Do you think people are building AI that they want rather than the AI that people need?

People working in data ethics point out how important it is to ask, “Is this even a technology we should be building at all?” I agree that not all things should be built. A good example is identifying somebody’s gender from their pictures, which is inherently going to be anti-trans and anti-non-binary. It goes against our understanding that gender and sexuality are things that people determine for themselves. Having a technology that identifies a person’s gender or sexuality not only fails to create anything positive but also leads to harms and errors.

Can you envisage a world where everyone understands and is capable of building machine learning models?

I think in the future, we are going to talk about AI literacy in the same way we talk about reading literacy or numeracy. People need to be able to evaluate AI products that are being sold to them, some of which are snake oil. In the US, particularly, the government often doesn’t have the expertise to evaluate technologies that are being pitched to them. There are many AI products sold by private companies that are not backed by science, like predictive policing tools that claim to identify future criminals. There are a lot of risks and harms in deploying such algorithms in society, and often, the people who are being sold these products do not have the know-how to evaluate them.

How challenging is it to teach AI to someone without higher mathematics or programming knowledge?

The long-term vision Jeremy Howard and I have for fast.ai is to build tools so that people don’t need to learn how to code. But the technology is not there yet, and for now, people do need to know how to code to build deep learning models. However, everyone should be involved in deciding policy and governance questions around the use of AI in society, regardless of whether they know how to code (and knowledge of the humanities and social sciences is incredibly important in considering these questions). My data ethics course, which covers bias, disinformation, surveillance and the venture capital ecosystem, has no prerequisites.
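To give a sense of what “knowing how to code” currently means for deep learning, here is a minimal sketch in the style of fastai’s public quick-start: fine-tuning a pretrained image classifier in a handful of lines. The dataset, model choice and exact calls are illustrative assumptions drawn from the open-source fastai library, not a prescription from the course.

```python
# Minimal sketch: fine-tuning an image classifier with the fastai library.
# Illustrative only; follows the pattern of fastai's public quick-start docs.
from fastai.vision.all import *

# Download a small teaching dataset (Oxford-IIIT Pets) bundled with fastai.
path = untar_data(URLs.PETS) / 'images'

# In this dataset, filenames starting with an uppercase letter are cats.
def is_cat(filename):
    return filename[0].isupper()

# Build dataloaders with a 20% validation split and 224x224 images.
dls = ImageDataLoaders.from_name_func(
    path, get_image_files(path), valid_pct=0.2, seed=42,
    label_func=is_cat, item_tfms=Resize(224))

# Start from a pretrained ResNet-34 and fine-tune briefly; transfer learning
# is what makes small datasets and modest compute workable.
learn = vision_learner(dls, resnet34, metrics=error_rate)
learn.fine_tune(1)
```

Even a short example like this assumes a year or so of programming comfort, which is why building better tools, not just teaching, is part of the long-term vision she describes.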

What’s the greatest professional challenge you’ve overcome?

There is a cumulative exhaustion that builds up from years or even decades of dealing with sexism and toxicity. That has been challenging, but I am very happy with where I am now.

What do you consider your greatest achievement?

I am so proud of fast.ai. Back in 2013, I was trying to pivot from mathematics to deep learning and I just found the field to be so cliquish. If you hadn’t gone to grad school and studied under certain professors, it was just so hard to access the knowledge of AI and deep learning. Nobody was writing down the practical knowledge and people were often quite unwelcoming about it. I’m proud of fast.ai because we tried to open it up to a more diverse group and make available the practical things you need to know about writing code.

Also, I am now the Founding Director of the Center for Applied Data Ethics at the University of San Francisco, which started a year and a half ago. We are focusing on the harms that algorithmic systems are causing right now to real people. Surveillance, disinformation, bias: I feel like these are some of the most important topics we can be working on. We have tried to bring together a mix of people from industry, non-profit organizations, the community, academia, and local government to work on these issues.

Who’s been your biggest influence?

My partner and co-founder of fast.ai, Jeremy Howard. He is a very unconventional thinker, in a refreshing way. He is ambitious and confident when it comes to innovation. On top of that, he’s also a kind, thoughtful and very caring person.

Finally, how would you recommend people get started in AI and deep learning, without getting overwhelmed?

Naturally, I’m biased towards fast.ai. We offer practical AI and deep learning courses for coders, with the only prerequisite being one year of coding experience. All courses are free and have no ads. We’ve had several students who have gone on to secure jobs in the industry, publish papers or start their own companies.

In more general terms, I strongly recommend finding a resource that’s good enough and then sticking with it. Some students get stuck bouncing between different tutorials and courses, never completing anything. It’s also helpful to build your own project, but again, you have to stick with it. See it through to completion and don’t be tempted to start something else until it’s finished.