TGG Podcast #55: Sarah Rudd - Arsenal's analytics pioneer
Written by Training Ground Guru — September 14, 2023
FOR almost a decade, Sarah Rudd was Vice President of Analytics and Software Development for StatDNA and Arsenal, making her one of the most senior women in the Premier League.
She has now co-founded her own analytics consultancy called src ftbl along with husband Ravi Ramineni. In this episode of the TGG Podcast, Sarah told us about her work with Arsenal, her wider career and her thoughts on the future of analytics.
You can listen to the episode via the player below, or read an edited transcript after that.
1. START OF THE JOURNEY
Sarah Rudd: I knew I wanted to get into football analytics, but football analytics didn’t really exist back then. I had the pleasure of chatting with Mike Forde (the former Chelsea Director of Football Operations) at one of the early MIT Sloan Sports Analytics Conferences and explained my situation and what I wanted to do.
He gave me the advice, ‘Anybody can say they can analyse football data, but what football clubs really want is for somebody to show them what they can do.’
One year later, StatDNA is at the (MIT Sloan) Conference and they have this research paper competition where they will give you one season of Brazilian data and I thought, ‘This is the opportunity I’ve been looking for, where I can really show clubs what I can do.’
I took the data and built a Markov chain model, looking at what’s the value of the situation the player is currently in and where they moved the ball. Is the team more likely to score from that situation, or less likely? You can assign numbers to that, how much did they increase the probability of scoring or not.
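(Editor's illustration: the idea Sarah describes can be sketched as a small absorbing Markov chain. This is a toy example under stated assumptions, not her actual model: the three pitch zones, the transition probabilities and the function names `scoring_probabilities` and `action_value` are all invented for illustration. Each zone gets a value equal to the probability that the possession ends in a goal, and a ball movement is credited with the change in that value.)

```python
# Toy sketch of valuing ball movements with an absorbing Markov chain.
# ASSUMPTION: zones and probabilities are invented, not taken from any real data.

# TRANSITIONS[zone] maps the next state to its probability. "goal" and
# "turnover" are absorbing states that end the possession.
TRANSITIONS = {
    "defensive_third": {"defensive_third": 0.50, "middle_third": 0.35, "turnover": 0.15},
    "middle_third": {"defensive_third": 0.20, "middle_third": 0.40, "final_third": 0.25, "turnover": 0.15},
    "final_third": {"middle_third": 0.20, "final_third": 0.40, "goal": 0.05, "turnover": 0.35},
}


def scoring_probabilities(transitions, iterations=1000):
    """Estimate, for each zone, the probability the possession ends in a goal.

    Repeatedly applies v(s) = P(s -> goal) + sum over zones s' of P(s -> s') * v(s').
    Because every zone leaks some probability to an absorbing state, this settles.
    """
    values = {zone: 0.0 for zone in transitions}
    for _ in range(iterations):
        values = {
            zone: probs.get("goal", 0.0)
            + sum(p * values[nxt] for nxt, p in probs.items() if nxt in values)
            for zone, probs in transitions.items()
        }
    return values


def action_value(values, start, end):
    """Credit a ball movement with the change in scoring probability it caused."""
    return values[end] - values[start]


values = scoring_probabilities(TRANSITIONS)
for zone, value in values.items():
    print(f"{zone}: {value:.4f}")

# A pass from the middle third into the final third raises the scoring
# probability in this toy chain, so it receives a positive value.
print(f"middle -> final: {action_value(values, 'middle_third', 'final_third'):+.4f}")
```

In this sketch a player's contribution would be the sum of `action_value` over all of their ball movements, which is the kind of number that can be compared across players in recruitment.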
Why that paper resonated with people is it’s something that’s quite useful in recruitment. What I was thinking was if I worked inside a club, what are the sorts of things I would want to know to make these decisions? Sometimes that is not necessarily what people want to read.
From there I got to present at NESSIS, the New England Symposium on Statistics in Sports, and got to chat with Jaeson Rosenfeld and he decided to offer me a job. That’s how my story comes full circle I guess.
2. STARTING WITH ARSENAL
Arsenal had been an exclusive client of StatDNA. People like Hendrik Almstadt and Ivan Gazidis saw the potential in something like this and the competitive advantage. They had signed StatDNA to an exclusive deal within the Premier League and realised there was an incredible advantage in the type of data we were collecting.
StatDNA wasn’t just an analytics company, we were also a data provider. We were collecting really rich context around the events, and they saw the advantage in having that and not letting their competitors have that. They decided it made sense to bootstrap their department by acquiring a company, which I think made a lot of sense.
Pre-acquisition (of StatDNA by Arsenal) was really focused on recruitment, because there were a lot of things they didn’t want an external service provider to see, around match preparation, and I think that makes a lot of sense.
Post-acquisition we were involved in a host of areas - a lot in recruitment but also a lot in terms of pre and post match preparation, team performance analysis which would then lead into recruitment and squad needs analysis, and even doing some work on the fitness and medical side, so trying to tackle some of the injury prevention issues and things like that.
(Arsene Wenger) is a very smart man, so he was critical of a lot of things (StatDNA did), questioning things. However, I’ve seen things in the press saying he never rated StatDNA. If you look at who’s working at Fifa with him, it’s Jaeson Rosenfeld, so clearly the two of them got along and enjoyed working with each other!
In those days Wenger was kind of the sole decision-maker (at Arsenal), so he would listen to everybody and then make his decision. How much is coming from the data side, how much from the scouting side, how much from him? It’s always difficult to say.
We worked closely with his backroom and coaching staff, so even if we weren’t in the room with him, there was certainly influence coming from various stakeholders around him. We had a really great relationship - and still do - with the performance analysis department. That’s one of the areas where there is a great working relationship between objective data and subjective analysis, and where we could influence the coaching side through that.
The Headquarters (of StatDNA) were in Chicago. I was always in Seattle, there were people in Boston, Salt Lake, so we were kind of spread out all over the place. It made things difficult, because having that personal relationship with people is important, but the upside of it was that football training grounds are incredibly distracting places to do work.
Being physically removed from that had the advantage that we could work in isolation and do some of the deep thinking we need to do. We actually had a house near the (Arsenal) training ground, so there would be a rotating cast of people from StatDNA staying there. There probably wouldn’t be more than a week or two when there wasn’t someone from StatDNA staying there.
3. GETTING BUY-IN FROM THE COACHING STAFF
That’s a huge part. If you don’t get trust and buy-in then you are just working in a bubble and might as well be sat at a University working in the proverbial ivory tower.
One of the methods we used to get that buy-in was to link everything to video, so we are not talking about models, we are talking about football on the pitch.
You can sit down with a member of staff, like a Steve Bould, and watch 20 video clips and say, ‘This is what the model is saying here, what do you see?’ And he’ll say, ‘I disagree with this, are you taking into account these factors?’ And these factors would turn into features for the next iteration of the model.
When you operate like that, he feels ownership over the model. That’s a really important aspect I maybe didn’t appreciate early on. With my husband (Ravi Ramineni), at Seattle, he was sat in the coaches’ office all the time, they would have those conversations and then it's a lot of buy-in, because they say, ‘This guy really understands the game, we speak the same language.’
You also need to make sure the message is agreed upon between performance and data analysis. You don’t want to contradict each other so that the coaching staff has to ask, ‘Who do I believe?’ You can disagree, but eventually that has to be resolved, and you should do that prior to going to the coaching staff.
4. ARRIVAL OF UNAI EMERY AND GETTING THE DOSAGE RIGHT
(In an interview with TGG in August 2021, Unai Emery's assistant Victor Manas said they had found the amount of data they were getting from StatDNA 'a bit overwhelming.')
I think that’s a fair criticism. The term I heard earlier this week is the right dosage - so figuring out what’s the right dosage for each individual. When you have a coaching staff change like we went through, it can be quite difficult to find the right dosage. Everybody previously, we had been working with them for six, seven years, so over time that dosage increases - particularly for things like pre-match opposition reports.
At the start of the season it starts very small and over the course of the season it’s like, ‘I have a question about this, can you add this in,’ and it grows and grows.
You’re used to it, so it’s not an issue. Emery’s staff had a different set-up at PSG, so it’s different metrics, different terminology, different language. I think we got the dosage wrong and probably needed to peel it back a little.
You want to present the data in a way that’s actionable, but you are dealing with people who are so detail-oriented and have such a nuanced way of viewing the game and you also need to encapsulate that. Finding that balance can be really difficult.
Another member of Emery’s staff, Javi Garcia, he’s the Goalkeeper Coach, absolutely love him, he just wanted everything, so it just comes down to the individual, what their workload is, what their working style is, what their personality is. I don’t think there is a right or wrong way. We just need to get better at understanding those individual personalities and tailor things.
Sometimes we were sending out reports that are going out to Director of Football, coaching staff, performance analysis - it has to be three different reports, because it’s three different audiences.
One thing, particularly as you get to larger clubs, the number of individuals that coaching staff are interacting with can explode and become huge. As a way of protecting them from information overload, if you can minimise the number of interactions, I think that’s quite optimal.
5. BALANCE BETWEEN HUMAN EYES AND DATA IN SCOUTING
I think it definitely changed over the years. Early on, it was working separately and then fighting it out at the end. Then it was a shift to, 'We will bubble up some targets and do due diligence on the final list.' Now it’s much more collaborative and integrated and is working quite well.
StatDNA is fortunate in that they can go out and collect data on any player in the world that has video - and then trying to fill in any blind spots, whether it be through scouting or data or ideally both.
I don’t know if this has been highlighted quite as much in the press, but a big part of that is shifting people like Mark Curtis and Ben Knapper into recruitment roles. They both come from performance analysis backgrounds, so they are people who have basically been working with data their whole careers.
Ben was there from day one of StatDNA, so he is incredibly well versed in using data to analyse the game and now using it in a recruitment context, so I think that was a clever shift in terms of the personnel doing the scouting. He focuses on loans, but I think he’s also involved in recruitment, particularly within the UK.
6. MIKHAIL ZHILKIN AND DATA SCIENCE FOR PERFORMANCE
Mikhail's role opened up because we did not have the bandwidth to support all those areas. The human performance department said, 'We really need a dedicated person who can sit in the office with us and help us answer these questions.' Doing things like getting all the GPS data off of somebody’s laptop that they had probably discarded.
Then Mikhail Zhilkin came in and has really pushed things on. It’s been good having a dedicated person for that.
7. LEAVING ARSENAL
There were a number of different reasons for the change, but a big part was that I had been there for close to 10 years. There is not a lot of turnover of staff at Arsenal - or at least there hadn’t been up to that point - so I found myself working at one club, in one league, with similar staff. I loved the people I was working with, but you get to the point where you’re saying, 'Am I growing as much as I should be and do I need a change of scenery to push myself on?'
It was a difficult decision to leave and to decide what next. You don’t necessarily want to go and work at another club, because would you just be doing the same thing as at Arsenal but starting from zero?
8. WOMAN IN A MALE-DOMINATED SPORT
(Were you ever treated differently or doubted because you're a woman in a male-dominated sport?)
No. At Arsenal you’re working with the best of the best so it was never really an issue. Everybody knew me, respected me, so there weren’t any issues (about being a woman). Probably more because I’m American! People always question, ‘How much do you know about football, you’re an American.’
When you leave and take a step back, you’re like, 'There aren’t many women in this field in general.'
Last year I attended the inaugural Women in Sports Data Conference in New York. That was a groundbreaking moment for me, because for so long I felt, 'I'm the only one,' but I spent a Saturday afternoon in a gymnasium full of women interested in sports, data and technology and you realise things are really changing.
There are quite a lot of women in senior roles in other US sports. I didn’t have anybody to look up to, so hopefully I can be someone they can model and say this is a viable career path.
I try to spend as much time as I can speaking to students, encouraging them, telling them about different things they can do to get into the industry. I’ve taken part in a number of initiatives. There was one during the pandemic, basically office hours for people who were under-represented in the sports industry, so I would set some time up on my calendar and people could schedule a 30-minute call with me.
I actually ended up hiring one of those people, at the place where I was between Arsenal and src ftbl. It’s beneficial for all parties, initiatives like that.
Susana Ferreras is at the training ground in Colney and also does work with the Spanish Women’s Basketball Team. When we were hiring for the position that Mikhail Zhilkin got, we only had two women applicants. One was Susana. She had won a silver medal for Spain at the Olympics (in basketball). The other was a woman working in Formula One. Unfortunately she had to drop out.
It was really shocking to see how few female applicants there were. Susana was so impressive that we decided to shuffle things and make a position for her. Behind the scenes I had a female software engineer with me in Seattle called Shauna Storey. There were three of us, along with Tyler Cox, who is now at the Seattle Sounders.
9. WHO'S THE BEST AT DATA ANALYTICS IN THE PREMIER LEAGUE?
I would still say very few clubs are doing it well, in my experience and my husband’s - he was at the Seattle Sounders for about 10 years, had a much smaller budget than I did, won two titles and they were in the final four out of five years.
When we talk about clubs doing analytics well, there’s the data science side and then there’s the implementation side. Who can influence decision making? Which clubs have good decision making processes? I think the combination of those two is quite rare.
And what we see is that when you are successful with that, it is very difficult to sustain. Certain clubs might be possibly moving away from what made them successful, or the conditions that made them successful have changed, so they have to do something different.
We all know football clubs have massive egos, so everybody thinks success was due to them and so the balance of power tends to shift a little bit. Markets change, other clubs catch up, so it’s really difficult to sustain.
I think there’s a large number of clubs that are getting started and I think they are doing good, interesting work. Outside England and the United States, the rest of the world is still lagging quite a bit and there is a lot of opportunity there for clubs to gain a competitive advantage.
Nobody ever really knows what is going on behind the scenes, but I think you can reverse engineer signings a bit. Certainly Brighton and Brentford are well up there, even though they are probably the most secretive of the bunch. Their transfer business has been phenomenal for the last couple of years.
It’s a really smart set-up where you make this investment in analytics that serves two purposes. It’s going to be a little bit tailored to one or the other - analytics you would do for betting is quite different from what you would do for recruitment - but that foundation is there.
Certainly anything involving proprietary data collection is going to be really valuable on both sides of the business.
10. DIFFERENCE BETWEEN CLUB AND TWITTER ANALYTICS
Twitter is for fun, Twitter is entertainment for a lot of people. People like to have debates about who is better or why is this team good and why does this team fall off a cliff.
Some of that stuff is not necessarily actionable or how a club would approach it, so there is a huge gulf. Twelve years ago, when I got hired into StatDNA, I was absolutely blown away at what they were already doing.
They already had a pass value model, they just implemented it in a different way, so there is also that huge gulf of what is happening in the public sphere versus what is going on behind the scenes internally at a club, where you are thinking about these things full time rather than just on a Saturday or in your evenings.
The early part of my career was trying things and seeing why they don’t work. Early on I had football analytics solved! I was absolutely certain - just throw all the numbers into a neural network and your work is done! I learned so much by trial and error and I think that’s really useful.
That was my first model but it didn’t have a good name! This was me being terrible at personal marketing. I don’t even remember what I titled that paper or even if I had a name for that metric, but it was something horrendous and terrible and then Karun Singh comes along and names a similar concept expected threat and I think people gravitated to that, because it made more sense, where you have expected goals and then you have expected threat.
The implementation of what he was doing is quite different from what I was doing, but he did a great job naming it. You can go around to people and say ‘expected threat’ and people understand what that means.
There is certainly this culture, or was, of making a name for yourself on Twitter and certainly part of it is having a catchy name for your metrics. Maybe it has gone a little bit too far.
You don’t want to have the industry full of acronyms that don’t really make sense. You want to have things tied down a little more to footballing concepts, but even that is difficult, because each country has their own footballing language.
We are still in an age where we need to get the basics right. I see a lot of bad stuff out there still, where people are assuming a number means something when it doesn’t. We have to nail the basics still.
11. UNTAPPED POTENTIAL OF TRACKING DATA
In terms of broad areas of questioning that I think are going to be solved soon, certainly off-ball movement. A lot of questions around the impact of team-mates, so how much are your receptions in behind based on your movement versus the ability of a team-mate to find you.
The big area that is still going to be quite difficult to answer is around defensive abilities. That is something you can scratch the surface of with tracking data, but in terms of figuring out why Virgil van Dijk is so good, that is going to be a difficult one.
There is also this whole new world of questions around how can we get more certainty around a player’s ability to translate from one league to another and I think tracking data will be really helpful for that, in terms of the physical context and also team shape, team dynamics, what kind of defensive situations are you finding yourself in.
Space is a big part of it and decision making is a big part of it. Neither is done in isolation. Being able to figure out what’s because of you versus what’s because of the situation and your team-mates, that’s the difficult part still.
12. NEED FOR MORE SHARING
I joined StatDNA in January 2012. Liverpool’s department started around then, maybe a year or two before. But because it was so cutting-edge, we didn’t talk to anyone. We went on our path, Liverpool went on theirs, City have done their thing, which has made it quite difficult.
It’s something I have never been happy with. Arsenal were quite secretive, they wanted to keep everything in-house. I love chatting with people, I love chatting with young people. I think that’s one of the nice things about my current situation - I can be a little more open. I love seeing the work people are doing and maybe provide constructive feedback on some things.
You start wondering what else are people doing that we haven’t thought of? Everybody wants to protect their competitive advantage, but I think, for the good of the industry, it’s good to share.
Solving the same boring problems as everybody else; sometimes it’s like, ‘I wish someone had just written a library and we could use that for the boring stuff and spend our time on the interesting stuff.’
13. NEW VENTURE
About eight months ago, my husband Ravi Ramineni and myself, as well as our business partner Cole Grossman, decided to set up a company called src ftbl.
We are a small football analytics consultancy company. We are trying to help out clubs that are in the early stage of their analytics journey and turbo boost what they are doing. Typically clubs that maybe don’t have an analytics department or maybe they’ve hired one data scientist but aren’t really sure how to integrate them into the decision making process. That’s our target audience with this company. We want to help get them started and eventually get them to be self-sufficient.
Every club sees a competitive advantage in this, so it makes sense to eventually move things in house. We help them get to that stage and when they do, the idea is we shift over to working on some of the strategic research questions that people in a football club would love to answer but never really have the time to do so.
The early stage of analytics was do everything in house and then you develop things in a bubble. Certainly when I was at Arsenal, things were quite secretive, we were working in isolation.
Now, people are seeing there’s a lot of benefit in understanding what’s a third party opinion on how else to do this, what else is out there. A lot of clubs find themselves behind, so collaborating with a third party is a great way to catch up, rather than trying to build everything yourself, because it’s really hard to make up that ground if you’re having to reinvent the wheel from zero.
No football club can really leverage tracking data properly, so that’s one of the big areas of research that we are looking into, particularly with the proliferation of broadcast tracking data, so you know you can use tracking data to answer a lot of questions around recruitment, which you couldn’t in the past.
So I think that is going to be a heavy area of focus for us, as well as player development, I think that’s one area that a lot of clubs are very interested in now. The market is too expensive for a lot of clubs to go out and look for players, so a lot of clubs are coming to us and asking how can we develop talent in house more effectively and more efficiently.
14. FUTURE OF ANALYTICS AND THE IMPACT OF AI
We are already seeing huge impacts on us. All of the broadcast tracking data is built off deep learning models, object recognition and being able to work out the homography of those images. So it’s been impacting us for a couple of years without us really noticing it.
Just in terms of day-to-day productivity, I rely heavily on ChatGPT to do things that I’m not an expert in or not necessarily good at. There are certain ways I use it to make me more productive. But I don’t think we are at the point where you can have ChatGPT do all the work for you.
A fun way I entertain myself is have ChatGPT write scouting reports for different players. They sound really realistic but they’re kind of full of lies. That’s one of the things we have to be careful with, that it just makes up a lot of stuff still.
People’s jobs will change and certainly some individuals will be impacted by it. ChatGPT can write a scouting report but do I want it replacing my scouts? Absolutely not. Most people, your job will be safe or altered, hopefully for the best, where you’re not having to do a lot of the repetitive work.