Why do visitors get to have all the fun? The Data & Insight team at the National Gallery set out not just to understand what audiences have come to the Gallery in the past, but also, using machine learning methods, to begin predicting what audiences might come in the future. This talk looks at how a forecasting model for exhibition ticket sales and audience attendance was developed using machine learning, and details how it has since been used within the organisation. We will also explore how machine learning can surface insights from visitor feedback.
Senior Manager: Data & Insight
The National Gallery
This presentation was filmed at the MuseumNext Digital Summit in Autumn 2019.
Tom : Hi, yeah, I’m Tom Cunningham from The National Gallery. I’m the data analyst.
Casey : And, my name is Casey Scott-Songin. I’m the senior manager of data and insight at The National Gallery. We are part of one of those apparently elusive data and insight teams that work within a museum. So, there have been many examples over the years of really interesting and innovative applications of machine learning and artificial intelligence in the museum sector. Things like chatbots to plan a visit, using computer vision to do different types of image searches, having AI help us generate tags on large data sets for our collection, things like entity recognition. And, we see a common theme among these projects, they’re often working on problems that arise from large amounts of collections data. How do we increase access to it? How do we better connect visitors to our collection? Things like that. But, our team started to think about what other problems we might be able to solve for the gallery with the data we had or could get. And, we realised that the potential to use AI and machine learning to process large amounts of data was significantly broader than what we had originally imagined. Why not let the computer help us for a change?
So, one of the common themes I’m sure we all share as museum professionals is that we are always extremely busy. And, the data and insight team at The National Gallery is certainly no exception. One of the great things about museums is that they tend to collect things, including data, and that’s great. That means we have a tonne of all different kinds of data, hundreds of years’ worth. Don’t worry, it’s not personal data, so it’s GDPR compliant. Plus, the research we were doing was very focused on things that happened in the past. Why might something have happened? How did it happen? But, we started to think, could we learn from the past to try to predict what visitors will do in the future? And, how could we start to be more proactive with what we were learning? But of course, as time-strapped museum professionals, could we teach a machine to process the data for us?
We’re going to show you two examples of where we used machine learning techniques within our exhibitions process. The first looks at how we tried to answer the question: how many people are going to come to a particular paid exhibition? To give you some context, The National Gallery is free to enter, but every year we have a few paid special exhibitions. As you can imagine, knowing how many people to expect impacts a whole variety of things, anything from budgeting to staffing models to the amount of stock we should order. So, wouldn’t it be great if we could say, within a degree of certainty, how many people we should expect? This type of forecasting is something that we first start looking at about 12 to 18 months before an exhibition opens.
It’s not as if we’ve been flying blind every year. Finance has been using historic data to model forecasts for years, but they were limited by the amount of data that they could process themselves. I mean, we are only human after all. There was a conflation of targets and forecasts, and the accuracy of the model depended on the exhibition actually having opened. So we posed the question: how might we use data to more accurately predict how many people are going to come to exhibitions in the future?
Tom : So, I’m going to try and answer that question. Before I get into the specific application for our gallery, I thought it might be useful to explain some basic concepts about what we mean by a model. First of all, in this context a model is basically a way of giving the computer a load of data that it then ingests. It looks at features of this data. So for an exhibition, it could be things like the season of the exhibition, or the artist that the exhibition focuses on. And then, it comes out with a prediction for how many people are going to come to the exhibition. So, how do we do this? I’m going to introduce you quickly to three different types of models. Don’t worry, it’s all very high level. It’s not in-depth. So, the first one is a decision tree.
So, a decision tree basically takes a load of data and splits it on different sets of features. For example, as before, if it gets a load of data about past exhibitions, it could split it according to the season of the exhibition. It then branches this data into two sets: one’s summer, one’s not summer, and then it will do the same thing again. It keeps splitting the data into smaller and smaller groups until it eventually gets to a small group where it can evaluate roughly how many people are going to come to that type of exhibition. Obviously, if you just do this one rough split of all the data, it’s not going to be very accurate. It’s going to depend very heavily on the data that you’ve given it already. So, it might freak out a little bit when you give it something new that it hasn’t seen before.
So, one way to combat this is to have lots of these decision trees. So, if you think about one person making a prediction about how many people are going to come to an exhibition or anything for that matter, compared to a whole room full of people making that same prediction, you’re going to trust the room. So, what a random forest does is it basically uses a lot of these decision trees. And for each one, it picks a set of the features that I was talking about. So for example, season, the artist of the exhibition. And, it picks a random set for each tree. So, each tree is a different way of modelling the data. It then kind of takes an average of the prediction from all of these trees to get a hopefully more reliable prediction. But, there is one better version and that is a boosted tree model.
And, whereas the random forest just generates a load of trees and then takes the average of all those trees’ predictions, a boosted tree model looks at the weaknesses of the previous trees and tries to specifically improve those areas. So, the idea is that the weaker predictions get better and the already strong predictions stay strong. And then, the total prediction is a weighted average of everything that’s come before. And, this is the kind of thing we use. It’s not without its drawbacks, though. This kind of model is quite easy to overfit. So, it can get too used to the data that it’s seen before, and basically fall flat when you give it new data. So, the skill is really in tuning it so that it’s as generalisable as possible to new data.
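To make the three model types concrete, here is a minimal sketch using scikit-learn on made-up exhibition data. The features (month, an artist-popularity score, marketing spend) and all the numbers are illustrative assumptions, not the Gallery’s actual model or data.

```python
# Illustrative only: synthetic exhibition data, not the Gallery's model.
import numpy as np
from sklearn.tree import DecisionTreeRegressor
from sklearn.ensemble import RandomForestRegressor, GradientBoostingRegressor

rng = np.random.default_rng(0)
n = 200

# Hypothetical features per past exhibition:
month = rng.integers(1, 13, size=n)          # 1-12
popularity = rng.uniform(0, 100, size=n)     # artist popularity score
spend = rng.uniform(0, 50, size=n)           # marketing spend, arbitrary units
X = np.column_stack([month, popularity, spend])

# Synthetic attendance: popularity and spend help, September dips, plus noise.
y = 2000 * popularity + 500 * spend - 30000 * (month == 9) + rng.normal(0, 5000, n)

models = {
    "decision tree": DecisionTreeRegressor(max_depth=4, random_state=0),
    "random forest": RandomForestRegressor(n_estimators=100, random_state=0),
    "boosted trees": GradientBoostingRegressor(n_estimators=100, random_state=0),
}

new_show = [[6, 80.0, 30.0]]  # June, well-known artist, decent marketing budget
for name, model in models.items():
    model.fit(X, y)
    print(name, round(model.predict(new_show)[0]))
```

A boosted model like this is easy to overfit, exactly as described above, so in practice parameters such as `n_estimators`, `max_depth` and the learning rate would be tuned against held-out exhibitions.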
So this is the fun bit: the features that we used in our model. As I mentioned before, there are all different kinds of features that impact on an exhibition. And, if you think about an exhibition, there are different facts about it that might influence how many people are going to come. The first one of these is the time of the year. At our gallery, and I don’t know if it’s the same for other museums, we normally get a lot fewer people coming in September, for example. And, these patterns are different depending on whereabouts in the year we are. So for an exhibition in September, we might expect to get fewer visitors. Then the day of the week: you might expect we get more people coming on the weekends, when they’re not at work. But, it’s a bit more subtle than that as well.
We tend to get a lot of members that come on Wednesdays, for some reason. Then there’s the proportion we are through the run. This makes sense intuitively if you think about it, because a lot of people want to come right at the start of the exhibition, and a lot of people want to come right at the end, because they don’t want to miss it. But in the middle, we tend to see a bit of a drop-off. Then the artist’s popularity. Again, this is kind of an obvious thing, right? A lot of people are going to come and see an artist that they’ve heard of, a Leonardo da Vinci. Whereas we did a Thomas Cole exhibition last year, and it was great, but not many people have heard of him, so it didn’t get too many visitors.
Obviously, this is kind of less directly measurable than the other things that I’ve explained here. So, to try and combat this, what we did first was we did a big survey with YouGov with all, well… How many people was it? 100…
Casey : 14,000.
Tom : 14,000 people, to try and see which artists they were familiar with and which artists they would go and see a paid exhibition about. And, this was great. The results for this were really good. We used them in the model and it really improved the results. But, that was a finite list of artists. It was quite a long list, but it was still finite. And, we found that in the data, we had some exhibitions by artists that weren’t included in that list.
So, we had to think of another way around this. And, one way that we thought of was using the artists’ Wikipedia pages as a kind of proxy for their popularity, which worked really well. It’s free data, easily accessible through the Wikipedia API. And, that’s kind of replaced the YouGov survey data in the model now. However, there is still one problem: all of our exhibitions, and I’m sure a lot of people’s are the same, are not always about one specific artist. We sometimes have thematic exhibitions. One that springs to mind is an exhibition we did called Monochrome, which was about black and white paintings, with no specific artist.
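As a rough sketch of the Wikipedia idea: the public Wikimedia REST API exposes monthly page-view counts per article, which can serve as a popularity score. The endpoint below is the real Wikimedia pageviews API; the date range and the idea of summing views into one score are illustrative assumptions, not the Gallery’s actual code.

```python
# Sketch: Wikipedia page views as a proxy for artist popularity.
# The endpoint is the public Wikimedia pageviews REST API; the dates
# and the single summed score are illustrative choices.
from urllib.parse import quote

def pageviews_url(article, start="20180101", end="20190101"):
    """Build the URL for monthly page-view counts of one English Wikipedia article."""
    return (
        "https://wikimedia.org/api/rest_v1/metrics/pageviews/per-article/"
        f"en.wikipedia/all-access/all-agents/{quote(article, safe='')}/"
        f"monthly/{start}/{end}"
    )

# Fetching this URL (e.g. with urllib.request.urlopen) returns JSON with
# an "items" list containing a "views" count per month; summing those
# gives a single popularity feature to feed into the model.
print(pageviews_url("Leonardo da Vinci"))
```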
So, we also try and factor in the movement of the art, or the period that it focuses on. Again, this isn’t really perfect, because exhibitions can focus on a range of movements or periods, but it’s close enough. And finally, the marketing spend. This makes sense as well, right? If you spend more on an exhibition, more people are going to see it, so more people are going to come. So yeah, it makes sense to put that in. However, we have found problems with this in the past: it hasn’t been recorded properly for really old exhibitions, so it’s messed up the model a little bit. So, use this sparingly. I should mention as well that we actually have two separate models. One is focused on forecasting the total number of people that are going to come to an exhibition, and we normally do that about a year in advance.
And, that obviously doesn’t use things like the weekday, because it is forecasting the total number. But then, we have another model that we run closer to the time when the exhibition opens, which predicts how many people are going to come on each day. But, this isn’t perfect. We’re not trying to say this is perfect, and in the future we are looking to improve it. So, some things that we have recognised that we haven’t included, but would like to in the future, are special events. For example, you might have seen recently that there’s been a lot of press coverage about the Extinction Rebellion protests. Well, in London the Extinction Rebellion protest was in Trafalgar Square, which, if you know London, is right outside The National Gallery. And we don’t know for sure, but we think it had quite a significant impact on the number of people coming to see our recently opened paid exhibition.
Another example of special events could be things like bank holidays. In our recent Sorolla exhibition, for example, we saw a massive drop-off in people during the bank holiday, which was really weird, because we were actually expecting a slight increase. But, we realised it was one of the hottest bank holiday weekends there had been for a number of years, so we thought that might have something to do with it. So, weather is something that we would like to factor in. If it was raining, as it is in this picture, we might expect an uptick.
So, I’ve given you a brief overview of what our model is and the things we use in it. But, I just wanted to briefly talk you through the process of doing this. The first stage is obviously gathering the data. For the past few years, we’ve been really good at recording everything about the exhibitions, and we do this at a daily level. However, if you go back more than 15 years, it’s not that good. So, I actually had to go down to the archives and leaf through the physical leaflets to find out whether exhibitions were paid or not. After you’ve got all that data, it needs to be pre-processed. It needs a bit of work done to it so the model can actually interpret it, things like transforming the date into something that the machine knows is cyclical.
So the 31st of December isn’t 364 days away from the 1st of January, for example. When you’ve done that, you actually train the model on the data. And, I think a lot of people outside of this field think that this is maybe the hard bit. It’s actually the really easy bit. This is the bit the computer does for you: you just chuck the data in and it gives you something back. The hard bit is then interpreting what it gives you back, and whether you need to alter it. Parameters like the number of trees in the model, like we talked about before, are something that needs to be tuned manually. And, there are other small details that need to be manually adjusted so that you get the best performance out of the model.
So, once you’ve done all this, you’ve got a final model. It’s great, you’re happy with it, and you think the prediction is as close to the actual number as it can be. Then you can put in the new data and, great, you get a prediction for an upcoming exhibition. But then, what do you do with that? How do we use it? And how do we get people to interpret it?
Casey : So, the answer is: it’s not very easy. There was some scepticism initially. People were hesitant about adopting a new system of forecasting. Trying to decide how many people we think are going to come is so important to so many things that we do, and a new way of doing it seemed a bit mysterious. Because, how will the computer know more than I do? I’ve worked here for 15 years. And, because we were working so far in advance, 12 to 18 months out, it’s not like you can prove it. You sort of have to say, “Okay, well, just trust us, and in 18 months’ time we’ll tell you if we were right or not.” It doesn’t really go over well when you’re trying to sell that. But, conveniently, alongside this work we were working on a way to provide a holistic overview of our exhibitions to the gallery.
And, we decided to incorporate this forecast into other bits of data that we were going to roll out. We hired a company called Analytics Engines to help us build a dashboard they called Perspective, which is what you see here. This is a tool that we use to bring together a variety of sources about our exhibitions at a really high level, with the idea that everyone in the entire organisation would have access to it. What you can see here is the second type of forecast that Tom was mentioning: the daily forecast. The green line is actual attendance, and the red dotted line is forecasted attendance. It’s not that we’re perfect in the first three weeks; we actually retrained the model at week three, so it has accurate actual data for the first three weeks, and after that it’s pure prediction.
And, what you can see is what Tom was talking about earlier. In week five, the model was like, “Yeah, this will be a fine week,” and it turned out to be the very first sunny weekend of the year in April, and there was a huge drop-off. Things like incorporating weather data will help improve that model in the future. But besides that, it did a relatively good job of predicting. This proved to be extremely successful in terms of getting high levels of adoption throughout the organisation. This is a URL that people can access. We had people ask to sign up, rather than just telling them that they now had an account. And, we now have over 140 users of this tool in the organisation. We slowly add more things, and as we do, we continue to roll that out. People start to ask for their information to be incorporated.
And then, we can start to connect all of these dots and work on breaking down those data silos. This has led to people using these kinds of forecasts in their decision-making on a daily basis, which is great. It also means that we now get asked for a lot of forecasts; it’s one of those things where success is not necessarily a good thing. So now, we get asked for forecasts for a variety of things, some of which we can forecast and some of which we probably can’t. But because it makes decisions so much easier when you know roughly how many people you’re expecting, it then becomes that much easier to validate why you may or may not need something.
Okay. Two minutes. So we have a second example, but we’ve talked a lot. Example one was about how we can look at things in the future. Example two, I just want to briefly introduce it and we won’t necessarily go into it. It looks at ways we can use machine learning to learn more about our visitors and their needs during an exhibition run, in order to better accommodate our visitors. So, this is actually once an exhibition is open. When the data and insight team was formed two years ago, it was decided to bring all of the exhibition evaluation in-house. This gave us the flexibility to enhance our collection methods and methodologies.
And, we decided to do this because we wanted to be more proactive as an organisation. We felt that waiting until an exhibition was over, and then four to six weeks after that, to get a report didn’t really help us understand what needed to be fixed now, before an exhibition closed. But, to start looking at visitor data in near real time, you have to start collecting a lot of surveys. And, we did. We went from a sample size of about 200 to 300 for the entire exhibition to about 200 to 300 a week, which is amazing and great. But, we had a new problem on our hands: we can’t possibly process all of that visitor data in a timely way by ourselves. So, we looked at how machine learning techniques could help us process this new and vast mountain of valuable data. I don’t know if you want to just… Okay.
Tom : So, in these surveys, as I’m sure a lot of your organisations do, we have a lot of free-text elements. And these are, traditionally, I think it’s fair to say, the hardest bits to get meaning out of, because they’re so hard to analyse. These are a few examples of comments that we get on our surveys. Some of them are maybe what you’d expect: they’re about the exhibition, about people’s experience of the day. Some of them are actual reviews of our surveys. Some of them are very concise, and some of them aren’t. And, that’s actually an abridged version of that comment; it wouldn’t fit on the slide otherwise. So obviously, there’s a lot of meaning in these comments. We can see in the first comment there that it mentions the exhibition, the layout of the exhibition specifically, the audio guide, the timed-entry operational side of getting into the gallery, and travelling to London.
So, there’s loads of information to get out of all of this, but how do we pull it all out easily? Obviously, this is very hard to do by manual tagging. Firstly because, if you’re manually tagging, you might miss some of these individual aspects; you probably wouldn’t tag all four of those things. And obviously, because we get a lot of these comments. [inaudible 00:22:03] Okay, sorry, I’m being told I need to wrap it up.
So, basically we used an algorithm called latent Dirichlet allocation, where you set the number of topics and it groups these comments into topics according to words that are used together. Sorry, I’ve got to skim through this. This is an overview of the eight topics that we found for this one specific exhibition, based on the words that were used together. And then, we can use this along with some sentiment analysis to really dig into which topics people are finding a problem with in the exhibition, and use that to proactively pre-empt the problems that people are having. Thank you very much.
Sarah: Thank you so much, Tom and Casey. I’m so sorry to rush you, but [crosstalk 00:22:59] we are getting into the coffee break. I’m so sorry. So, there are just two questions I want to ask. And then, if you have other questions, please, they are also available during the breaks. An interesting question here from Niels Wouters, who will also be speaking here later today: is there an opportunity to share AI prediction models across museums? And, in light of issues with AI, and maybe you’ve already talked about this, certainly also in terms of inclusivity and bias, how do you handle that? And, his question is how to achieve this?
Tom : So, just on that first one.
Sarah: The first one, yes.
Tom : There definitely is, yes. And, I think that’s a really great idea. The problem we’re having at the moment is actually sharing this model within our own organisation; we don’t really have the infrastructure for that currently. It’s something we’re looking into. The forecasts themselves, yeah, we share those, obviously. But the actual models and algorithms themselves, that’s something that we need to develop at least within our own organisation, but it would be great to share those across organisations. Yeah.
Casey : Yeah.
Sarah: And, the part of inclusivity?
Casey : So, we’re part of a network called the Museums and AI Network. That was a US-UK collaboration, part of AHRC funding. We’re presenting some ethics guidelines at MCN in San Diego in a few weeks. But, that’s been a lot of the conversations we’ve been having: how can we be really ethically minded when developing these kinds of things? So, we’ve tried to keep in mind some of the things that we’ve developed there, and you’ll see some of the worksheets published in the next couple of weeks. So, that might be a more concise answer to that question. But, yeah.
Sarah: Yeah. [inaudible 00:24:50].
Tom : Sorry, and just to add quickly on that, I think it really helps doing this in-house, because we know the things that we want to focus on and the things that we shouldn’t.
Sarah : Right. Museum people know, yes.
Tom : Yeah, exactly. Yeah [crosstalk 00:00:25:03].
Sarah: Okay, all right. Yes. And, we’re going to talk about digital and museum ethics later today as well, so we will probably talk about that too. And one little question: are curators, programmers, building managers, finance, etc., open to changing ways of achieving your goals?
Tom : Slowly, they’re coming round to it, yeah.
Sarah: Do you have a trick to share with us how to?
Casey : I think what was really great-
Sarah: Many coffee breaks.
Casey : was that we had buy-in at the upper levels of the organisation. When you go to exec and you say, “I can tell you how many people are going to come in 12 months’ time,” they get really excited, and then everyone else has to follow along.
Tom : And, I think, obviously, demonstrating that it works. Just doing it, and then showing people that what we’ve done is really useful, is a really effective way of doing that.
Sarah: All right. Thank you so much, Tom and Casey [crosstalk 00:25:56].
Casey : Thank you.