DataTopics Unplugged: All Things Data, AI & Tech

#88 “Data Shapes AI, and AI Shapes Data,” Emilie Nenquin on VRT’s Digital Transformation

DataTopics

In this episode, we explore how public media can build scalable, transparent, and mission-driven data infrastructure - with Emilie Nenquin, Head of Data & Intelligence at VRT, and Stijn Dolphen, Team Lead & Analytics Engineer at Dataroots.

Emilie shares how she architected VRT’s data transformation from the ground up: evolving from basic analytics to a full-stack data organization with 45+ specialists across engineering, analytics, AI, and user management. We dive into the strategic shift from Adobe Analytics to Snowplow, and what it means to own your data pipeline in a public service context.

Stijn joins to unpack the technical decisions behind VRT’s current architecture, including real-time event tracking, metadata modeling, and integrating 70+ digital platforms into a unified ecosystem.

💡 Topics include:

  • Designing data infrastructure for transparency and scale
  • Building a modular, privacy-conscious analytics stack
  • Metadata governance across fragmented content systems
  • Recommendation systems for discovery, not just engagement
  • The circular relationship between data quality and AI performance
  • Applying machine learning in service of cultural and civic missions

Whether you're leading a data team, rethinking your stack, or exploring ethical AI in media, this episode offers practical insights into how data strategy can align with public value.

Speaker 1:

Good evening Emilie, good evening Stijn. I'm actually very happy to have you as a guest today, Emilie, because, some context for the listeners maybe: we worked together for over one and a half years, I think, and for me it was a very pleasant collaboration. So thank you for being here. I will start with a script generated by ChatGPT to introduce you. So here we go. Today, we're joined by Emilie Nenquin, Head of Data and Intelligence at VRT, the Flemish public broadcaster. Emilie has played a key role in shaping VRT's data strategy, focusing on ethical, human-centered use of data and AI in a media context. With a background that bridges data science, journalism and innovation, she brings a unique perspective to the evolving role of data in public service media. At VRT, she leads a team working on audience insights, personalization and responsible AI. We're excited to hear her take on the future of media data and digital transformation. What do you think?

Speaker 3:

That sounds great, I couldn't write that better myself. So, nice to hear. I think it's quite a good description of the scope of what I'm doing at VRT. And the scope is even larger, as we are really a centralized team within VRT with a lot of responsibilities: it means insights, it means AI, but it also means the VRT profile and everything related to login and authentication, and also the management of the data platform. But it's quite good; you can send it to me. Many topics to talk about.

Speaker 1:

Stijn, could you also introduce yourself?

Speaker 2:

Yeah, sure. I've been working at Dataroots for about three years now. I'm part of the data strategy unit, as you are. My role at Dataroots is both team lead and analytics engineer, and analytics engineering is also what I'm doing at VRT. That means I'm part of the data point team, one of the three data teams at VRT, all belonging to the D&I, Data & Intelligence, group led by Emilie.

Speaker 1:

What does it mean to be an analytics engineer?

Speaker 2:

For us at VRT, analytics engineering is focused mainly on turning raw data coming from different sources into actionable, analytics-ready data. That serves use cases like AI and dashboarding, but also people: a data analyst can use specific tables to generate insights, and data scientists can use them to make predictions. So it's mainly a bridging role between the raw technical data and the business teams and business goals, but also the data people inside the different data teams.
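
To make that bridging role concrete, here is a minimal sketch (in Python, with hypothetical table and column names rather than VRT's actual schema) of turning raw tracking events into an analytics-ready table:

```python
import pandas as pd

# Hypothetical raw tracking events, one row per event
raw_events = pd.DataFrame({
    "user_id": ["u1", "u1", "u2", "u2", "u2"],
    "brand": ["vrtmax", "vrtmax", "sporza", "sporza", "vrtmax"],
    "event_type": ["page_view", "video_play", "page_view", "page_view", "video_play"],
    "timestamp": pd.to_datetime([
        "2024-01-01 18:00", "2024-01-01 18:05",
        "2024-01-01 19:00", "2024-01-01 19:02", "2024-01-01 19:10",
    ]),
})

# Analytics-ready model: daily engagement per brand, the kind of table
# an analyst or data scientist can use directly
daily_engagement = (
    raw_events
    .assign(date=raw_events["timestamp"].dt.date)
    .groupby(["date", "brand"])
    .agg(
        visitors=("user_id", "nunique"),
        video_plays=("event_type", lambda s: (s == "video_play").sum()),
    )
    .reset_index()
)
print(daily_engagement)
```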

Speaker 1:

Cool, we'll talk about that later too. Emilie, going back to the start of your career: I saw that you started in banking, working at both ING and KBC, if I'm not mistaken. You then made the switch to VRT, the public broadcaster. What are some typical differences or similarities you saw at KBC and ING?

Speaker 3:

So maybe, if I go way back: my journey, first of all, is a bit atypical. I studied languages, followed by a postgraduate degree in business economics, and that's how I arrived in the banking world. So I had no background in engineering or IT or anything, and my first job at KBC was application manager, but I had no idea what that was or what it would be. So I rolled into that position and, funny enough, I was the application manager of an application called Corona. At that time, of course, no one knew about the virus. It was a kind of reconciliation tool for international payments.

Speaker 3:

So it was really in the back office of KBC, and there was a lot of data, but it was transactional data. That's where I wrote my first queries, and I had training in SAS, for instance. So that's where the world of data opened up to me. I was at KBC for seven years, I think, always in that back-office payments business. That's also where I had my first experience with leadership. After that I moved to ING, which was a totally different world, as it was a digital world. There I came into the bank in the role of program manager, where I had to lead the digital transformation of everything that was online banking. And there another data world opened up to me: the digital analytics world. When a position became available as Head of Digital Analytics, I jumped into that position, and so it was the best of both worlds.

Speaker 1:

Yeah, so you had those experiences and could then combine them at VRT.

Speaker 3:

Exactly.

Speaker 1:

And when you started at VRT, what did VRT look like back then? Because you mentioned the digital transformation at KBC. Was it also ongoing at VRT? What did the start at VRT look like in terms of data and AI?

Speaker 3:

Yeah, I really started at ING with digital transformation, in the sense that we saw the behavior of the customers changing: people were not coming to branches anymore, and we really had to develop online banking and mobile apps, things like that. And then, eight years later, I arrived at VRT, and that same transformation was starting there. So it was about eight years behind the banking world. That was very surprising to me, but it was what it was.

Speaker 3:

It was also a period when Netflix already existed, but it was not yet really embedded in people's daily behavior, and that's why this transformation came a lot later. We had a lot of apps and a lot of websites at VRT, but from a data perspective there was not a lot of measurement on these things. All brands had their apps and websites, but there was only very basic measurement of, okay, what people were doing on these apps, and so on.

Speaker 3:

So it really felt like I was thrown back to the period of ING eight years before.

Speaker 1:

And so you could do it all over again.

Speaker 3:

Yeah, exactly, that was the impression I had.

Speaker 1:

It's good that you had a backpack full of these experiences then, I assume.

Speaker 3:

Yeah. From a technology perspective, you know things already: you've tested a lot of tools and you know what works and what doesn't, so that was a good thing. And the other thing is: how do you involve people in your story, and how do you convince people to become more data-driven? That was also an experience I took with me from ING to VRT.

Speaker 1:

It's always good to apply lessons learned to a new context. Stijn, you also worked in banking before?

Speaker 2:

Yes, that's right. A few years ago I also worked in the banking industry, for about a year, and I definitely see some differences now, working at VRT for a similar time span. For me, on a data level, there's a big difference in how the data is collected. In banking you have this one big customer-centric platform, so it's very structured; you can expect how the data will look, in a way. Whereas in the media industry you have all these different kinds of platforms and different interactions with users, adding a lot of complexity and a lot of volume. That makes it way more challenging to structure the data in the same way. So definitely a difference. In terms of culture, there are also big benefits to the media industry: in the banking industry you have this risk-averse focus on handling AI cases, for example, or data initiatives, whereas in the media industry you can iterate more quickly, you can experiment, and in the same sense also apply way more advanced technology.

Speaker 3:

I think maybe a funny story on that. When we started at ING with introducing Adobe, and tried to explain to them that we would like to have our tracking data in the cloud of Adobe, all the risk people were very scared and said: okay, but the cloud, can you explain to us what it is, and can we see the cloud? It took us six months to convince them that a cloud was not really something visible or tangible. Adobe even invited them to Ireland to visit the data center and explain to these people what the cloud was. And after they did that visit, we finally got the go-ahead to start implementing the Adobe tracking on the website of ING. In that kind of thing you see that, okay, we were already ahead of things. The cloud is of course obvious nowadays, but at that time it wasn't.

Speaker 1:

But the risk people you had at ING, or the compliance side, you have less of that at VRT?

Speaker 2:

Yeah, definitely. And with the product teams, you can really work together and collaborate towards these kinds of new innovations. You see that they're really eager to work with this new technology and new AI-based innovations. That's very cool to work with.

Speaker 1:

Yeah, it's really in the mindset, right: being risk-averse versus being experimental and trying new things out, and so on. On top of that, you're not only working in the media sector now, the media industry, but there's also a difference between commercial companies within that industry and the public broadcaster. I experienced it myself: there's a bigger emphasis on the public benefit of what you're doing, and not so much on only the commercial aspect. How did you perceive that? Or was that a mindset change you had to make?

Speaker 3:

Honestly, it was a bit of a relief, I think, because, of course, in commercial organizations everything is related to making profit, and that's normal. When I arrived at VRT, that was not the first thing people talked about. It was not a topic at all. In fact, it was really linked to: okay, we have a social commitment.

Speaker 3:

We have a commitment towards all Flemish people. We have to reach all these people, whatever their background or their level of education or their age. We simply have to reach everybody, every Flemish person, and we have to inspire them and connect them. And that's really beautiful, I think, for a company like VRT, that you can work in that kind of spirit, and that all the projects you do with data, with AI, have to contribute to that vision. That makes it even richer. Recommenders, for instance, don't just need to generate more clicks, but really also need to inspire people, to make them discover things they would not discover themselves. That gives another dimension to your work, and, I think, to the people within the team.

Speaker 1:

Yeah, that definitely inspired me, also in the way I tackle or approach new topics at new companies. I always try to think about that too: not only what value does it bring in terms of profit, but also how can we make a difference for people, and how can we consider inclusion and so on. So for me it was also a relief to experience that.

Speaker 1:

It's definitely something I'm still working on every day. If we talk about working at VRT, and just starting your journey and getting to know all the different brands: what type of data do you typically work with at media companies?

Speaker 3:

Yeah, in the beginning we started to focus on user data, so all the behavioral data coming from our online platforms, and next to that, of course, we have the VRT profile, which is more declarative data from the users themselves. So that's one big chunk of data. We have a lot of visits on our platforms, so it's a huge volume; that's also a big difference with the banking world. And then, next to this user data, we have content data: all our content has metadata, like the description of a program. That's also a huge part.

Speaker 3:

And the third, which is a bit smaller, is company data, so finance data, HR data. We try to combine all this data on our data platform, and of course it went gradually. We started with user data; now we are really focusing on the content metadata and already adding some sources of company data, but always with use cases and value generation in mind. Finance data was really added because we wanted to calculate the ROI of a program, for instance. We would not just add finance data because we think it will be useful one day; it's always with some use case or value generation in mind.
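
As a toy illustration of the kind of question that finance data unlocks once it sits next to audience data (all numbers invented), the ROI question reduces to simple arithmetic:

```python
# Invented numbers: combining finance data with behavioral data
production_cost_eur = 450_000   # from the finance systems
unique_viewers = 900_000        # from tracking data across platforms

# A simple ROI-style metric for a program
cost_per_viewer = production_cost_eur / unique_viewers
print(f"Cost per reached viewer: EUR {cost_per_viewer:.2f}")  # EUR 0.50
```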

Speaker 1:

That's really important. And so you started at a company with multiple brands: Studio Brussel, Radio 2, Radio 1, but also VRT MAX, the streaming platform, VRT NWS, the news platform, Sporza, the sports news platform. So many different brands, and many different data sources, I assume. Did you bring it all together?

Speaker 3:

Yeah, the big challenge when I arrived at VRT was all these brands, each with their own platforms. I think at the beginning we had about 70 apps and websites that we had to track, and on which we had to put a standard library to have standard tracking. In the meantime there has been rationalization, so we have fewer now. But for the tracking we only have one source nowadays, and it's Snowplow, which is really tracking all the data. So it's not that all these brands are using their own systems; that's something we really try to limit.

Speaker 3:

But for metadata, for instance for content metadata, you really go back to the back office of content production, and there you still have a lot of systems and a lot of things to combine and integrate. That's quite a huge project nowadays, to also rationalize that. So it depends a bit. I think we were lucky when we arrived at VRT that there was not a lot of user measurement already in place, and we could start from a blank page and implement it the way we wanted to have it. For content, of course, there had already been a lot of content applications and content production applications for a long time, and that's really a legacy that you have to manage nowadays.

Speaker 1:

And then how many people were working in the data organization when you started?

Speaker 3:

Yeah, when I started I was alone, and a second colleague came in a few weeks after me, whom I took away from ING... no, that's not the right way to say it... who joined me from ING as well. So we had a common base, to say it like that. But we were two.

Speaker 1:

And then you start with identifying use cases, as you mentioned. How does that work? What's the first thing you do? Is it like doing a maturity assessment, getting to know everyone?

Speaker 3:

I think the first thing we did was understanding the culture of VRT. It's nice that you mention a maturity assessment; that's something I would have done at ING, for sure, but I don't think that was the right approach in a company like VRT, where a lot is done bottom-up and where relations with people are very important. You need to understand what people are doing before dropping things like maturity assessments, and you need to be very careful there. So we started with understanding the culture, and we listened a lot to all these editorial people: what are you doing? How do you work? How can data help you in your day-to-day job? What if we did this or that, would you be happy with that, could that help you in your day-to-day job? That was really the first thing we did: speaking with all these brands and all these editorial teams to understand what could be of benefit for them and what kind of data projects could bring them benefits.

Speaker 1:

Is that the approach you're taking too, Stijn? Because you have all these editorial teams, journalists who are always rushed and always in a hurry, because there's always new news. How are you experiencing that?

Speaker 2:

I find it very interesting to see both the attention to and the collaboration with data and AI, because, of course, these people are very creative and good at what they do, so we don't want AI to take over and write a full article. But it's very nice to see that they would love to work with data supporting them: automating some parts of the process, but also advising them to do certain things in certain ways. It's a very nice collaboration to see.

Speaker 1:

So it needs to be really close to them, to their day-to-day and really what they're focused on.

Speaker 2:

Yeah. If they see the benefit, and how I experience it is that they really want to work with data, they can really use tools and insights to get started with the same process they used to follow, but in a more efficient way, focusing on the parts that are really important to them.

Speaker 1:

Yeah, I can remember from working together with journalists that they typically don't have a lot of time to do more long-term analysis, they want to, but it's more difficult, and that you really have to focus on embedding it into their workflow and their daily tasks. If you look back at the early accomplishments, or the first win moments, let's say, what would that be for you, Emilie?

Speaker 3:

I think it took some time, like I said, to implement all these measurements on all these apps and websites, but once we did that, it was very nice to see that people really started using dashboards and insights, and received answers to questions they had already had for a long time. And they were simple questions: what are people doing on my app, on the website, what content are they consuming, how old are they, etc. That led to more complex questions, and we became a bit of a victim of our own success, because we received a lot more questions, and a lot more complex questions, and we were not able to answer them, because we had only a limited part of the data in Adobe at that time. So that's when we decided that it would be nice to have more data together in a data lake, and that we had to take the next step from the Adobe cloud alone, where we had the user data, to a larger data platform where we could combine user data with, for instance, content data, to answer easy questions like: what kind of people vote every week in the Studio Brussel De Afrekening, for instance?

Speaker 3:

That seems an easy question, but you need user data to answer "what kind of people", and you also need content data about the votes and the voting list, etc. That part we didn't have at that time. So we needed to combine these things and really have more data to answer those kinds of questions. And of course, after that you can have many more use cases and start again, and have some more success with that.
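
A minimal sketch of the kind of join Emilie describes, with hypothetical profile and voting tables standing in for the real user and content data:

```python
import pandas as pd

# Hypothetical VRT profile data (declarative user data)
profiles = pd.DataFrame({
    "user_id": ["u1", "u2", "u3"],
    "age_group": ["18-24", "25-34", "45-54"],
})

# Hypothetical content data: votes cast on a voting list
votes = pd.DataFrame({
    "user_id": ["u1", "u2", "u2", "u3"],
    "track": ["Song A", "Song A", "Song B", "Song C"],
})

# Combining both sources answers "what kind of people vote?"
voters_by_age = (
    votes.merge(profiles, on="user_id")
         .groupby("age_group")["user_id"]
         .nunique()
         .rename("voters")
)
print(voters_by_age)
```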

Speaker 1:

And so you brought everything together in a data lake. But how did it work for the VRT profile then? Did you have different profiles for each touchpoint?

Speaker 3:

No, that's also an evolution. With the GDPR, probably in 2019 I think, you had the big GDPR policies, and we decided there to have only one profile for all apps and all websites, and to really make it mandatory, for instance on VRT MAX, to make sure that we would have this data together with the behavioral data and the content metadata.

Speaker 1:

And so people started working with the data. They started looking at it. What we typically advise our clients is indeed to start with that, because many companies want to make a big jump towards AI, but they don't really know the data. How did looking at the data, and business getting a feeling for it, work towards those first AI use cases? How did that inspire them?

Speaker 3:

Yeah, of course. Once you can give them insights or dashboards on what people are doing on their platform, you automatically also show them what people are not doing or not discovering on the platform. And then you start thinking: okay, if we would like them to also see some more news programs instead of only fiction, or if we would like to propose more sports programs, then you start thinking together with them: maybe we have to invest in a kind of recommendation system, and then see in the data whether, after people received recommendations, you get a bigger reach in sports or news or whatever.

Speaker 3:

And that's how you come to the first AI use cases: you show them some insights first, and then you start. In fact, AI, or recommenders for instance, is not an objective as such. It's always because you want to have more reach, or inspire people; you have different objectives for why you would use AI and execute AI use cases, and that's really what you have to bring up and discuss with them. So whenever we implement AI, or a recommender for instance, immediately after that we give them the results and we look at the figures, so that you close the loop and you always keep having these discussions with them.

Speaker 1:

And so you're taking a very iterative approach, where one step leads to another, and that way you're growing in maturity. The team also grew a lot. How many people are now part of Data & Intelligence?

Speaker 3:

Today, together with the profile team, which is a bit of a separate team because it's mainly developers, not really data profiles, we are about 45 people now.

Speaker 1:

So you went from one to 45?

Speaker 3:

Yeah, from one, two, to 45. That's right.

Speaker 1:

How did that go? Very organically, or did you have a blueprint from the start?

Speaker 3:

No, the blueprint from the start was very small. We started with one data engineer, who's still in the team, and again, questions were coming in. We wanted some use cases that would generate value, and the more use cases we started, the more people we got into the team. But it was very organic.

Speaker 3:

You start with one data engineer, one data scientist. You start building a small recommender, and after that you move to another one and you add a second one. But at a certain point in time, we also decided to move from Adobe towards our own tracking system, well, not our own, but we moved to Snowplow, and that also meant that the whole data modeling was not done by Adobe anymore. Then you take a next step. You say: okay, we need data modelers or analytics engineers, and you start adding that kind of profile; that's how Stijn came in. So it depends a bit on the technology choices that you make, and the choices that you make will also define what kind of people and what kind of profiles you need to add.

Speaker 1:

So again, very iteratively.

Speaker 3:

Yeah, iteratively. Based on the needs, you grow the team organically.

Speaker 1:

Enter Stijn. What was it like to join the analytics engineering team? Did they already have a lot of intelligence built, or were you there at the start?

Speaker 2:

When I joined, there were two analytics engineers, working mainly on the core features of the platform: building data models and growing the number of sources that were part of the platform. Currently we're with six engineers, which leaves us way more room to focus on other tasks that we can pick up, for example data quality. Instead of just delivering data to the dashboards, we can now also involve business earlier in the process, like Emilie mentioned, to get to the ideal data in the dashboards. It definitely leaves room for us to take ownership of specific products and tasks, which was a bit more difficult in the beginning. So the team is quite mature already. When I joined, there was definitely a mature data team in place, and that gives us the opportunity to focus on best practices and to include other teams in the process as well.

Speaker 1:

And then, in terms of collaboration, I will ask the same question to you later, Emilie, but being the bridge between the data platform and the data analysts, and I assume also business: what does that collaboration look like, and how do you make sure that your solutions are fit for purpose?

Speaker 2:

On one hand, you kind of bridge the gap starting from the sources. Our data platform team is mainly focused on bringing all the sources onto the data platform, whereas once we enter the modeling process, the business context also has to be taken into account. So you need to work with business to get that context as early as possible, to take into account the goals: where and how the data will be used. And then you have the partnership with the other data profiles, who each use the data in a different way. The collaboration with analysts is more focused on how we can get the most insights for business, and with the other data profiles it depends on how their use cases are shaped.

Speaker 1:

And so, as promised, the same question for you, Emilie, but maybe more related to the digital transformation aspect, because you mentioned switching the tracking system. Could you explain that in a bit more detail?

Speaker 3:

You mean how it works, or the collaboration?

Speaker 1:

Yeah, maybe first the change in tracking system. Why did you do that, and what is the difference?

Speaker 3:

Yes, there were different reasons. One of the reasons was that we wanted to move towards recommendations and have real-time measurement on those recommendations, and that means you also have to measure impressions, for instance, because you want to see what the recommender proposed to which people. And measuring impressions means a lot of volume. There are days that we have about 1 million visitors on our websites and apps, and if you then have to count all impressions, it multiplies very quickly. The real reason was that the cost would have exploded if we had stayed on Adobe. So one aspect is cost: high volume, which means high cost.

Speaker 1:

Maybe to clarify: an impression means that someone looked at something, right?

Speaker 3:

Yeah, an impression is when we propose a piece of content to someone, but they didn't click yet. We just show it to you on our website. So that's one thing. Another thing is that the real-time aspect is an important one. We had a kind of delay between the data sitting in the cloud of Adobe and it being sent to our own data platform. It was a delay of two hours, not that big, but for some use cases it was too much. So that's another reason.
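
To make the volume argument concrete: a sketch of what a self-describing impression event could look like (Snowplow events are self-describing JSON with a schema reference, but this schema URI and its fields are hypothetical), plus the back-of-envelope arithmetic on why impressions multiply event counts:

```python
import json
from datetime import datetime, timezone

# Hypothetical self-describing impression event, in the spirit of
# Snowplow's schema'd events; this exact schema URI does not exist.
impression_event = {
    "schema": "iglu:be.vrt/recommendation_impression/jsonschema/1-0-0",
    "data": {
        "user_id": "u1",
        "recommender": "homepage_row",
        "content_id": "program_123",
        "position": 3,
        "shown_at": datetime.now(timezone.utc).isoformat(),
    },
}
print(json.dumps(impression_event, indent=2))

# Why cost explodes: impressions fire per item shown, not per click.
visitors_per_day = 1_000_000        # figure mentioned in the episode
impressions_per_visit = 30          # hypothetical: items shown per visit
events_per_day = visitors_per_day * impressions_per_visit
print(f"{events_per_day:,} impression events per day")  # 30,000,000
```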

Speaker 3:

And then, of course, leaving the black box a bit. I'm not saying that the data or the metrics were not well calculated by Adobe, but it was a black box for us, and we really want to be very transparent and to know what we are doing and why we are doing it, also to tell it to our users in case they want to know: how does an AI model work? What's behind it? How is it calculated? What kind of data does it use? The whole process, from ingesting the data to the final recommender, is now very transparent to us, because we have it in our own hands.

Speaker 1:

And so you went from a black-box transformation layer at Adobe to creating the transformations yourself. What did that look like? How did you come up with the right schema and transformations?

Speaker 3:

Yeah, then over to Stijn, I think. No, but what we did was start using dbt for these transformations, and all the details I leave up to Stijn.

Speaker 2:

When I joined, most of the modeling was already happening on the platform, so that design phase was done. But you still need to onboard a lot of new sources, so each time you can try to think of the best solution to do so. It's kind of a moving process, which is still the case.
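
dbt models are usually SQL, but dbt also supports Python models; purely as a hedged sketch (hypothetical model and column names, assuming a Spark-backed warehouse), an incremental model over raw Snowplow events could look roughly like this:

```python
from pyspark.sql import functions as F

def model(dbt, session):
    # Incremental materialization keeps daily runs affordable at this volume
    dbt.config(materialized="incremental")

    # Hypothetical staging model with one row per Snowplow event
    events = dbt.ref("stg_snowplow_events")

    if dbt.is_incremental:
        # Only process events newer than what the target table already holds
        max_ts = session.sql(f"select max(event_ts) from {dbt.this}").collect()[0][0]
        events = events.filter(F.col("event_ts") > max_ts)

    # Shape raw events into an analytics-ready fact table
    return (
        events
        .withColumn("event_date", F.to_date("event_ts"))
        .select("event_id", "event_date", "user_id", "brand", "event_type")
    )
```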

Speaker 1:

And how do you deal with the fact that it's huge data, not just big data but huge data? Because I assume billions of rows every month, or week, or even day, I don't know.

Speaker 2:

The amount is indeed huge. So that was the first thing I had to focus on with the first adaptations I made in terms of modeling and design: you really have to think first about how to do it as efficiently as possible, otherwise it would lead to a bottleneck in the flow towards, for example, a dashboard. So it's very important to keep that in mind and only then start the modeling phase.

Speaker 1:

And maybe also about data quality. I think there are two aspects: one on the Snowplow side, and one on the metadata side you mentioned before. Let's start with the big data aspect and the Snowplow data. Are there many data quality issues? Because I can imagine you have all these events with different dimensions attached to them. How does that work, and how do you work together with product teams, for example, to solve these issues, or even to prevent them?

Speaker 2:

Yeah. So, as I mentioned before, you have all these different sources, so you cannot rely on one format or one structure that the data will have in all cases. In the beginning, the goal was definitely to minimize the time to solve these issues, and also the time that engineers have to spend on data quality itself. But now we have a framework in place where the focus is not only on solving issues as quickly as possible but also on preventing them, minimizing that time to solve even more. For that, we really needed to collaborate with business teams, because as a data team alone you cannot detect every data quality issue or prevent it in the same way. So there is this kind of, and I think it's due to the culture at VRT too, shared ownership, where data teams and product teams both take ownership in the discovery as well as the assessment of data quality, leading to a very nice collaboration between the different teams.

Speaker 1:

And what does that look like? Do you have concrete examples? Is it applying validation rules or is it alerting?

Speaker 2:

Yeah, alerting was the starting point, and that gives us the option to react immediately to issues that are happening, to avoid people using the wrong data in a dashboard, for example. So that was definitely crucial. But if we also take the business context, or how the business process works for a product team, into account, we can help them prevent data quality issues. For example, if they release new software, we can collaborate with them on tracking the quality while the implementation is happening, so we spend far less time resolving issues afterwards.
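
A minimal sketch of that alert-first approach, with invented rules and thresholds; a real setup would route alerts to a channel or ticket system rather than print them:

```python
import pandas as pd

def check_events(events: pd.DataFrame) -> list[str]:
    """Run simple data quality rules and return human-readable alerts."""
    alerts = []
    # Rule 1: key identifiers must be present
    missing_users = events["user_id"].isna().mean()
    if missing_users > 0.01:
        alerts.append(f"{missing_users:.1%} of events missing user_id")
    # Rule 2: only known event types should arrive (catches bad releases)
    known = {"page_view", "video_play", "impression"}
    unknown = set(events["event_type"].dropna()) - known
    if unknown:
        alerts.append(f"unknown event types: {sorted(unknown)}")
    # Rule 3: volume should not fall off a cliff versus a rough baseline
    if len(events) < 1000:  # hypothetical daily minimum
        alerts.append(f"suspiciously low volume: {len(events)} events")
    return alerts

events = pd.DataFrame({
    "user_id": ["u1", None, "u2"],
    "event_type": ["page_view", "video_play", "vidoe_play"],  # typo from a bad release
})
for alert in check_events(events):
    print("ALERT:", alert)  # in practice: send to Slack, PagerDuty, etc.
```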

Speaker 1:

So again, the human aspect is super important. You need to communicate well with these teams and collaborate frequently.

Speaker 2:

Yeah, exactly.

Speaker 1:

And then maybe going back to the content data, because you described the three groups of data at the start of this episode: you mentioned that there's a lot of metadata, like the descriptions, the titles. Are there any data quality issues with that? Because you mentioned that multiple technologies are involved and it's a heterogeneous landscape.

Speaker 3:

I think that's really a big challenge, and like I said, we are starting a program around that to improve the data quality. That's because a piece of content goes through a lot of applications nowadays, and that's really something we want to reduce. Because what happens if your piece of content goes through a lot of applications?

Speaker 3:

Your metadata then also goes through a lot of applications, or it's even continuously enriched in these applications, and at a certain point you don't know anymore: what is the good metadata that we can trust now? Or is it enriched somewhere else that we don't have access to? So from a data platform or data lake point of view, it's also very difficult to know which systems we need to connect to, to have the final and right metadata on our platform, which is then used in dashboards or in AI applications. So it's really important to start with a good basis and to improve that basis. We call that the content supply chain project, and that's what we want to do in that project, because yes, we have a lot of data quality issues.

Speaker 1:

Do you have an example to make it tangible?

Speaker 3:

Yeah, for instance, we work together with a lot of production houses, and we buy a lot of content, of course.

Speaker 3:

So there is not yet a clear, standard intake process for receiving the production, the assets, the movie or whatever, from the production house, where they send us a lot of metadata with that asset, with that program. There is no form, for instance, where the production house has to type in all the metadata. So from some we receive a lot of metadata, from some a bit less, and then it goes into our systems, and there are people who look at it and say: okay, the subtitles are missing, or the cast is not complete. So they fill it out throughout the systems, and at the end, in the archive, there is still someone checking as well: is it complete, do I still need to add things? That's quite late in the chain, if you have to wait until it's archived before you're sure that it's complete.

Speaker 1:

And so you mentioned a supply chain. Is one of the first steps getting a map or a vision of the lineage of the data: the stages it goes through, the application landscape, the stakeholders that are involved?

Speaker 3:

Yes, and also a taxonomy that can be used for all the metadata throughout all these systems, but that can also be shared with, for instance, the production houses, telling them: this is how we would like to receive the metadata. So it's at the start of the chain, but also in the middle and at the end, everywhere, that you need the same structure before you can improve the quality.
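
One way to enforce such a shared taxonomy at intake is schema validation. A sketch using the jsonschema library, where the required fields are hypothetical rather than VRT's actual standard:

```python
from jsonschema import validate, ValidationError

# Hypothetical intake schema a production house delivery must satisfy
PROGRAM_METADATA_SCHEMA = {
    "type": "object",
    "required": ["title", "genres", "language", "cast", "subtitles_available"],
    "properties": {
        "title": {"type": "string", "minLength": 1},
        "genres": {"type": "array", "items": {"type": "string"}, "minItems": 1},
        "language": {"type": "string"},
        "cast": {"type": "array", "items": {"type": "string"}},
        "subtitles_available": {"type": "boolean"},
    },
}

delivery = {
    "title": "Hypothetical Drama",
    "genres": ["romantic", "comedy"],
    "language": "nl",
    # "cast" and "subtitles_available" missing: caught at intake,
    # not months later in the archive
}

try:
    validate(instance=delivery, schema=PROGRAM_METADATA_SCHEMA)
    print("Delivery accepted")
except ValidationError as err:
    print("Delivery rejected:", err.message)
```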

Speaker 1:

And does that mean you have to go all the way back to the production people? How do you collaborate with them? Because typically, I don't think you talk to them very often.

Speaker 3:

Yeah, I think we talk more and more with them nowadays, because everyone is realizing that it's really important. Again, we have some use cases. For instance, to give you an example: we are nowadays discussing a lot about the new... I have to say it in Dutch; I once translated it, but I forgot what it was called in English.

Speaker 3:

Okay, but it's the contract that you negotiate with the government, and in that contract a lot of KPIs are mentioned, and of course we want to be able to measure these KPIs. One of them is, for instance, that a certain percentage of the content on our platform should be Flemish content. Then: okay, how are we going to measure that it is Flemish content? Is it in the metadata somewhere? That kind of use case helps to go back to the production people and tell them: okay, we have these KPIs coming into that contract, and then it's very important that this is filled out somewhere. It's with these types of examples that we can really convince people that it is important. We have to report on a number of cultural programs; okay, but then it should be stated somewhere in the metadata.

Speaker 3:

We know in our recommenders that you are interested in, I don't know, a romantic comedy. But if nowhere in our programs "romantic" and "comedy" are stated in the metadata, we cannot propose you that kind of program. So we have a lot of use cases that make it very tangible and help us convince people that it should be improved. On the other hand, we don't need to be naive: if it stays a manual process, it will never be perfect. So that's where AI is coming in as well. More and more of this kind of metadata will be generated by AI in the future.
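
As a sketch of what AI-generated metadata could look like, here is a zero-shot genre tagger built on an off-the-shelf model; the genre list is invented, and a production setup would need evaluation against editorial judgment:

```python
from transformers import pipeline

# Off-the-shelf zero-shot classifier: no VRT-specific training implied
classifier = pipeline("zero-shot-classification", model="facebook/bart-large-mnli")

description = (
    "Two rival florists in Ghent keep sabotaging each other's shops "
    "until they are forced to plan a wedding together."
)
candidate_genres = ["romantic", "comedy", "drama", "documentary", "sports"]

result = classifier(description, candidate_labels=candidate_genres, multi_label=True)

# Keep genres above a confidence threshold as suggested metadata tags
suggested = [l for l, s in zip(result["labels"], result["scores"]) if s > 0.5]
print("Suggested genre tags:", suggested)  # likely ["romantic", "comedy"]
```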

Speaker 1:

And so you will use AI to improve the data quality that is helping your AI algorithms.

Speaker 3:

Exactly. That's why we say: data shapes AI, and AI shapes data.

Speaker 1:

That's the quote of the day, and a good title for the episode, I think. Okay, cool. So we talked about data quality. I'm also very invested in the topic of data culture. Is that also something you've been working on? Because I remember when I was working at VRT, it was not super easy to get the AI solutions and the data reports and so on adopted. How are you tackling that?

Speaker 3:

Yeah, it's still a challenge. I think we try to organize a lot of things to involve the people. With the data analysts, we are decentralized on the floor: they are really sitting in the editorial teams, and we added some data analysts there as well. It's also their role to explain the dashboards that exist, because sometimes people don't even know about them, and to explain how you can read and use them. But next to that, and it's a very recent hire, we hired a communication expert to really help us with all our storytelling, because the most important thing is really the storytelling and the PR of the Data & Intelligence team inside the organization. So we looked at it a bit from a marketing perspective and asked: okay, if we were a brand, what would our values be, what is our storytelling, what is our pitch? And that's now something we are developing.

Speaker 1:

Yeah, it's a bold move, but I like it.

Speaker 3:

Yeah, we will see. Like I said, we just hired a person. We have a lot of ideas and a lot of initiatives that we would like to launch, but I cannot tell you the results yet.

Speaker 1:

Then you'll have to come back. Cool. So you went from a more centralized approach to a more decentralized approach, in terms of data analysts and insights.

Speaker 3:

Yes. In terms of modeling and the data platform, of course, we still have this very centralized approach, although we are also working on more of a self-service mode, where we say: the data platform, we are not owning it, it's not our data platform; it's open to all kinds of teams and people at VRT, and if you want to use it, just come and see us and we'll explain it to you. We have some self-service processes in place and you can use it. That's really also part of data culture towards all the other technology teams: we open it up. And we're really looking at this platform as a central platform, also for more critical applications, like the playouts in the future maybe, or the data fed towards the Telenet and Proximus set-top boxes. Nowadays that's done in the production flow, but one day it could be that the data is on the data platform and that we are feeding these external systems as well, so that it becomes a more critical platform. And then maybe it's not the Data & Intelligence team managing it; it can be another, more technical team.

Speaker 1:

You just mentioned the playouts and also the Telenet and Proximus set-top boxes. How is VRT collaborating on a data level with other parties or companies? Are they?

Speaker 3:

Collaborating? I think the projects where we collaborate on our data and AI are really more in the innovation domain, where there are a lot of things ongoing. On our side, we exchange data with some external partners, like Telenet or Streamz, for instance, but it's not that we have fixed collaboration partners or structural projects in that way. If we do it, it's really more in the innovation sphere, for instance the Solid for media project together with the government, or things like that.

Speaker 1:

And I assume it's compliant, and that you put a strong emphasis on the privacy of the data and so on.

Speaker 3:

Yeah, exactly. That's also a reason why the collaborations are very limited: in everything we do, if it's about sharing data, we ask consent from the user. You can see it yourself. If you give consent that we can share your behavioral data, for instance with Telenet, then we will share it. If you don't give consent, we will not share it.
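
That consent gate is simple to reason about. A minimal sketch (hypothetical structures, not VRT's implementation) of filtering events by purpose-level consent before anything is shared:

```python
from dataclasses import dataclass

@dataclass
class UserEvent:
    user_id: str
    consents: set[str]   # purposes the user opted into
    payload: dict

def shareable_with(partner: str, events: list[UserEvent]) -> list[dict]:
    """Only pass on events from users who consented to sharing with this partner."""
    purpose = f"share_with_{partner}"
    return [e.payload for e in events if purpose in e.consents]

events = [
    UserEvent("u1", {"share_with_telenet"}, {"watched": "program_123"}),
    UserEvent("u2", set(), {"watched": "program_456"}),  # no consent: never shared
]
print(shareable_with("telenet", events))  # only u1's event
```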

Speaker 1:

So it's very strongly regulated, and consent management is one of the most important things in that story. Before asking what's next for VRT and the Data & Intelligence team, I wanted to go back to something you mentioned earlier: that there are a lot more questions than before, because people have access to all the data and the insights. How are you balancing doing business as usual with innovating and continuing to progress? With business as usual, I mean tackling these questions and being more of a support role for business stakeholders, versus developing new intelligence and new insights.

Speaker 3:

Yeah, that's a good question. I think it's still difficult to find the balance. It helps to have these data analysts more decentral, so that they can, educate is maybe not the right word, but they can help people become more data savvy, and maybe avoid asking the question and instead start looking into the dashboards themselves and try to find the answer. So that's more on the daily things. Next to that, of course, we have a roadmap, and it's full of new initiatives and new ideas, following a bit the strategy of VRT. If, for instance, today we had a discussion on social media, how we should approach that and how editorial teams are tackling social media and content on social media, then automatically we as a data team also have to think: okay, how can we put more effort into measuring all the efforts that are done on social media channels?

Speaker 3:

And there are still new social media channels being added. So maybe that's also something we should consider in our roadmap: for instance, measuring a program in its full 360 degrees of distribution, social media but also our own platforms, etc. We don't have a dashboard today that gives that view of a program being on Instagram, on Facebook and also on our own platforms. So you really have to move, follow the strategies that all these brands take, and try to help them out there.

Speaker 1:

Talking about the roadmap, how do you shape it? When you're looking at the roadmap, how do you decide what's next for the Data & Intelligence team? And could you give us a glimpse of some topics that are on it?

Speaker 3:

We have some main topics, like I said, that we know we want to work on. So we have some big projects, let's say content metadata, but AI as well, and these fill the roadmap first. Then we look, just in terms of capacity, at what we still have left after these big priorities, and then it's really an effort-versus-value matrix: if we think that with quite little effort something can generate big value, then that gets priority first. But sometimes, and that's a big difference with banking, for instance, where it was far more structured and we really had big sessions with big roadmap drawings on the walls, at VRT we also live a bit last-minute. Can I say it like that?
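
The effort-versus-value prioritization can be made concrete with a tiny sketch; the initiatives and scores are invented:

```python
# Hypothetical roadmap candidates scored 1-5 on effort and value
initiatives = [
    {"name": "social media 360 dashboard", "effort": 3, "value": 4},
    {"name": "metadata taxonomy rollout", "effort": 5, "value": 5},
    {"name": "automatic subtitles pilot", "effort": 2, "value": 4},
]

# Low effort plus high value floats to the top of the remaining capacity
for item in sorted(initiatives, key=lambda i: i["value"] / i["effort"], reverse=True):
    ratio = item["value"] / item["effort"]
    print(f'{item["name"]}: value/effort = {ratio:.1f}')
```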

Speaker 1:

A bit more chaotic.

Speaker 3:

Yeah, and that's okay, but it means that you also need to be flexible in terms of roadmaps, that you cannot fill it for 100% and stick to it. Be flexible and sometimes, like in an agile way of working, change your plans and adapt to that.

Speaker 1:

Orchestrated chaos.

Speaker 3:

Yes, exactly.

Speaker 1:

And so, looking at that roadmap, there's a new kid in town: Gen AI. Is it also on the roadmap?

Speaker 3:

Yes, it is.

Speaker 3:

We are writing an AI strategy as we speak. In the sense that AI is coming into a lot of the tools that we use; Adobe Premiere, for instance, has built-in AI, and there the vision is more about: okay, how can we make sure that it is adopted by the people?

Speaker 3:

And then, of course, you have the part of AI that is built by ourselves, and there we really categorize it. Okay, can it bring us efficiency? Let's say automatic subtitles, for instance: can we do that, will it be more efficient, will more programs be subtitled thanks to AI? Then, related to content, it's more in the production process: for instance, automatic generation of titles, summaries, things like that, where we can help the journalists gain some time as well. And then you have everything related to AI towards the user, which is not really Gen AI, but more the recommender part. It's along these three pieces, or three streams, that we are writing the vision, and also taking it into the strategy of build versus buy and all those kinds of things.

Speaker 1:

Gen AI typically works with unstructured data: think images, videos, text. Are you already tackling that within the analytics engineering team?

Speaker 3:

If you mean transforming unstructured data into structured data: yes, that is done, in fact, for a lot of programs and videos. We are already transcribing everything, so from speech to text, and once you have the text you can extract the named entities, all those kinds of things. These things are already done on the archive side, and more and more now not only in the archive but during the production process as well.
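
A sketch of that unstructured-to-structured pipeline, using openai-whisper for speech-to-text and spaCy's Dutch model for named entities; the episode doesn't name the tools VRT actually uses, so these are illustrative stand-ins:

```python
import whisper   # openai-whisper: pip install openai-whisper
import spacy     # plus: python -m spacy download nl_core_news_sm

# 1. Speech to text: turn a program's audio into a transcript
stt_model = whisper.load_model("base")
transcript = stt_model.transcribe("episode_audio.mp3")["text"]

# 2. Text to structured metadata: extract named entities from the transcript
nlp = spacy.load("nl_core_news_sm")
doc = nlp(transcript)
entities = [(ent.text, ent.label_) for ent in doc.ents]

# People, places and organisations mentioned become searchable metadata
print(entities)  # e.g. [("Gent", "GPE"), ("VRT", "ORG")]
```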

Speaker 1:

Thank you for giving us a view of the roadmap of VRT, very interesting. I think we can wrap up, unless you still have something to mention. No? That leads me to thanking you. Thank you very much for being here tonight. It was very inspiring and very nice to see you again. Thank you to you too, Stijn, for taking us into the world of analytics engineering at VRT. Have a nice evening.

Speaker 3:

Yes, you too, thank you. Thank you, Ben. Bye.
