AI at Enterprise Scale With AT&T’s Monika Malik
Building AI at enterprise scale is not just about models or algorithms. How do organizations manage governance, data, and risk while still moving fast?
This week’s VentureFuel Visionary is Monika Malik, AT&T’s Lead Data & AI Software Engineer. With a data engineering career spanning telecom (AT&T) and global banking (Barclays), she brings a pragmatic perspective on what it takes to move from experimentation to production-grade AI at massive scale in large enterprises.
This episode is a masterclass in scaling AI capabilities that are reliable, governed, and adopted by the business!

Episode Highlights
- Production-Ready AI Is More Than Model Accuracy – Monika explains why enterprise AI success depends on governance, evaluation systems, deterministic workflows, and logging… not just building accurate models.
- Escaping the Proof-of-Concept Trap – She discusses why many AI initiatives stall in pilot mode and how organizations must redesign processes, build ownership, and prepare their data architecture to scale AI beyond experiments.
- Human-in-the-Loop AI for Regulated Industries – The conversation explores why fully autonomous AI isn’t realistic in regulated sectors and how human oversight, role-based access, and audit trails ensure trust and compliance.
- Balancing Speed, Stability, and Cost – Monika breaks down the engineering trade-offs teams face when building enterprise AI systems and how leaders must choose between rapid experimentation and stable production systems.
- AI Success Is Measured by Business Impact – She shares why enterprises must define clear metrics — like reduced cycle times and real user productivity gains — to prove that AI initiatives deliver measurable value.
Fred Schonenberg
Hello everyone and welcome to the VentureFuel Visionaries Podcast. I'm your host, Fred Schonenberg. I'm the founder of VentureFuel and I'm joined today by Monika Malik. She is a lead data and AI software engineer at AT&T, where she works to design and deliver production-grade data pipelines and AI-ready systems at massive scale.
With a career spanning telecom and global banking, including years as a data engineer at Barclays before joining AT&T, Monika brings a pragmatic perspective on what it takes to move from big data foundations to AI capabilities that are reliable, governed, and actually adopted by the business.
Today, we're going to dig into what production-ready AI really looks like inside a large enterprise, the biggest challenges teams face when scaling AI beyond pilots, and the engineering decisions that make the difference between impressive demos and actual durable systems. We are both judges for this year's Consumer Electronics Show and I knew we needed to have her on the show, so I am so excited to welcome Monika today. Monika, it's so nice to have you.
Monika Malik
Thank you, Fred, for having me on the show.
Fred Schonenberg
So, you're building AI systems at such a massive scale inside AT&T. I'm curious, what does production-ready AI actually mean and how is it maybe different than what most people might think?
Monika Malik
So, a lot of people have this misconception that production-ready AI is about model accuracy. But when we are building an AI-driven solution, we have to focus on a lot more: we have to redesign the process, put controls in place, put governance in place, and have evaluations on the agentic workflow. And the most important part is having the right logging system. To give you some background, I'm currently working on a finance transformation project here at AT&T, where we are building agentic AI solutions to replace manual journal entries, which today is a heavily manual process in which analysts write those journals into the books.
So if we are trying to automate that, we have to ask: is the governance in place? If something fails, do we have humans in the loop? Also, we cannot have everything in the workflow be probabilistic or rely entirely on agents; we have to have deterministic components. For example, for these manual journal entries (MJEs), if there are calculations, we have to have a tool, like a Python-based script, so agents cannot make those decisions on behalf of humans. So I believe these are the most important things in a production-ready AI workflow: a proper logging system, proper evals, governance, and immutable logs.
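The deterministic component Monika describes can be sketched as a plain Python tool that an agent may call but whose arithmetic it cannot override. This is a minimal illustration with invented names (`JournalLine`, `compute_totals`), not AT&T's actual system:

```python
# Minimal sketch of a deterministic calculation tool in an agentic workflow.
# All names here are hypothetical, for illustration only.
from dataclasses import dataclass

@dataclass
class JournalLine:
    account: str
    debit: float = 0.0
    credit: float = 0.0

def compute_totals(lines):
    """Deterministic tool: sums debits/credits and checks the entry balances.

    An agent may *call* this, but the arithmetic and the pass/fail decision
    are plain Python, so the result is reproducible and auditable.
    """
    total_debit = round(sum(l.debit for l in lines), 2)
    total_credit = round(sum(l.credit for l in lines), 2)
    return {
        "debit": total_debit,
        "credit": total_credit,
        "balanced": total_debit == total_credit,
    }

entry = [
    JournalLine("cash", debit=1200.00),
    JournalLine("revenue", credit=1200.00),
]
print(compute_totals(entry))
```

The point of the design is that the agent can decide *when* to call the tool, but never *what* the numbers are.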
Fred Schonenberg
Now, I think one thing this brings up is that so many enterprises are stuck in what I'll call proof-of-concept mode. They, or someone in the organization, get very excited about agentic AI, but then they're just doing pilots and tests, because they don't have the systems you're talking about that let a pilot become a durable or business-critical system. How do organizations avoid that trap?
Monika Malik
So, in POC mode, a lot of the time we treat the solution as an experiment, when we have to focus on redesigning the process itself. Often, when we are doing a POC, we are using a curated data set that doesn't reflect the massive production data. So we have to think about whether the solution will work with data at scale and what challenges we are going to face there.
And a lot of the time we have to make a behavioral change. We have to have controlled frameworks and a system or platform built, instead of treating the AI solution just as an experiment. We also have to have ownership: there have to be business users, and there has to be governance, where we take risk, legal, compliance, and scaling into consideration. Only with all of those things in place can we think about a solution at the enterprise level and avoid getting stuck in POC mode.
Fred Schonenberg
How do you take that idea to a C-suite leader who says, "We need more AI"? What needs to be in place already to make that ambition realistic?
Monika Malik
So, we need process clarity. In my current work, say we are trying to automate this manual finance work. If we have, let's say, 100 finance analysts working across different teams, we cannot build one common agent for all of them. Every requirement will have a unique solution, so we have to focus on process clarity. Along with that, we need reliable and governed data.
We have to have clear lineage, and we have to define the probabilistic and deterministic components within the workflows. Then we have to have the platform: if we need agentic solutions, there are things that must already exist. We need an event-driven architecture and secure cloud infrastructure. Most companies will already have that, but these are must-haves if we want AI today.
Fred Schonenberg
Absolutely. One of the things I found really interesting about your background is that banking and telecom are both highly regulated industries, while AI and agentic systems are very cutting edge. Can you talk a little bit about that friction between wanting to automate and find new solutions, and needing things to be governed, auditable, and trusted?
Monika Malik
Yes. Like I mentioned, in each agentic flow we have to have a human in the loop, or over the loop. We cannot just say we are going to replace humans. For any regulated system, we need humans at the critical control points. Let's say an agent is preparing a report. We have put all the instructions in there and it has prepared the final report, but there should be a human who reviews it before it is uploaded into the books, rather than the agent doing everything on its own.
We have to have those controls by design. We need role-based access: not every agent can have access to everything, or to every piece of data in the company. We need those restrictions. Then we need a proper change management system. Let's say we get a new model upgrade in our current solution and we make those changes; we have to have version control and ownership, so we can go back and trace who made a change and what the outcome of that change was. With those traces and those evals in place, then we can design an AI solution for regulated industries.
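The "controls by design" idea, a human approval gate in front of the books plus a tamper-evident audit trail, might look like this minimal sketch. All names, the log format, and the hash-chaining scheme are hypothetical, not AT&T's actual system:

```python
# Sketch: agent output can only be posted after a named human approves it,
# and every action lands in an append-only, hash-chained audit trail.
import hashlib
import json
import time

audit_log = []  # append-only here; a real system would use WORM storage

def record(actor, action, payload):
    """Append an audit entry chained to the previous one, so tampering
    with any earlier entry invalidates every hash after it."""
    prev_hash = audit_log[-1]["hash"] if audit_log else ""
    body = prev_hash + json.dumps(payload, sort_keys=True)
    entry = {
        "ts": time.time(),
        "actor": actor,
        "action": action,
        "payload": payload,
        "hash": hashlib.sha256(body.encode()).hexdigest(),
    }
    audit_log.append(entry)
    return entry

def post_to_books(report, approved_by):
    """Critical control point: refuse to post without a human approver."""
    if not approved_by:
        raise PermissionError("agent output requires human approval before posting")
    record(approved_by, "approve_and_post", {"report": report})
    return "posted"

draft = {"entry": "MJE-042", "amount": 1200.00}  # imagine an agent drafted this
record("report-agent", "draft", draft)           # the draft itself is logged
print(post_to_books(draft, approved_by="analyst.jane"))
```

The key property is that the posting path simply does not exist without a human identity attached, and both the agent's draft and the approval are traceable afterwards.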
Fred Schonenberg
It's so interesting. I wonder if you could talk about the trade-offs, because obviously you want stability, control, and security, but those often come in tension with things like speed, flexibility, and scale. How do strong teams like yours stay ahead of that friction?
Monika Malik
So, these trade-offs are constant. Take speed versus stability: if we are doing a POC or an experiment in a sandbox, then we can favor speed over stability. But if we want to build a controlled production system, then we have to focus on stability.
It's similar for cost versus performance. If you are building a small model for classification, maybe you don't have to use a big or the latest LLM. But if you need a proper reasoning model for exceptions, or you're building some complex system, there you can use the latest LLMs, which can be expensive.
Fred Schonenberg
Very interesting. Let me ask you this. People have been talking about the promise and hype of AI versus what is actually being delivered today. At what point does an AI initiative start delivering business impact? When do you start to see the signals of adoption that you can share back with leadership?
Monika Malik
Yes. We have to have metrics in place. If, let's say, I'm proposing a solution that automates current manual work, then we have to check that the person reviewing that workflow is not spending more time than when they were doing the work manually.
In the case of this MJE example, we want those close cycles to shorten: say an analyst today spends 15 days on a book closure; with the agentic solution in place, maybe they spend two to three days on the validation part. We have to have metrics showing that the solution is really helping the user it was built for, that the user is not frustrated, and that the model is not hallucinating.
And what is the accuracy? Because if every time a human reviews the output, finds the validation is not proper, and ends up doing the work manually, then we cannot say we have delivered a good AI project. So by looking at those metrics, we can measure the impact of AI-delivered projects.
Fred Schonenberg
And those metrics almost need to be customized for each program, right?
Monika Malik
Yes, absolutely.
Fred Schonenberg
Right, based on the problem you're trying to solve, and then creating that KPI or metric specifically to show that this initiative has the potential to solve it in a big way.
Monika Malik
Right. Yes, correct.
Fred Schonenberg
Very interesting. What do you think the biggest misconception is for senior leaders about scaling AI? And is there a way to reset those expectations?
Monika Malik
Yeah, so we have to stop thinking of it as an experiment. Every few months a new LLM launches, and we go chasing LLM or model accuracy, when instead, if we are scaling AI, we have to ask: do we have the data architecture in place? Are we focusing on process redesign? For all the applications we have, if we want to add an AI component and automate them using AI, then we have to think about process redesign, change management, governance, and engineering.
I'll give you an example. Here at AT&T, as software engineers, we started with vibe coding: you just prompt whatever LLM you are using for your work, and it gives you the code. Instead of that, what we are doing at AT&T is building archetypes.
If I want to do documentation for my project, instead of putting that particular requirement into a raw prompt, we use an archetype. The archetype has the guardrails in place, so that it gives the solution as expected and we have those security measures in place as well. We have to focus in that manner.
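A guardrailed archetype, as opposed to a free-form prompt, could be as simple as a template that ships with its guardrail instructions baked in. This sketch is purely illustrative; the archetype text and function names are invented, not AT&T's:

```python
# Sketch: a documentation "archetype" that carries guardrails so developers
# fill in only the task-specific part instead of writing a raw prompt.

DOC_ARCHETYPE = """You are a documentation assistant.
Guardrails:
- Do not include credentials, keys, or internal hostnames.
- Only describe the code provided below; do not invent APIs.
Task: write documentation for:
{code}
"""

def build_prompt(archetype: str, **fields) -> str:
    """Render an archetype template with the caller's fields, so the
    guardrail text always reaches the model alongside the request."""
    return archetype.format(**fields)

prompt = build_prompt(DOC_ARCHETYPE, code="def add(a, b): return a + b")
print(prompt)
```

In practice a real archetype system would also enforce which models and data sources the rendered prompt may reach, but the core idea is simply that the guardrails travel with the template, not with the individual developer's discipline.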
Fred Schonenberg
Yeah, it's very interesting, right? This is one of the tensions in organizations: one of the beauties of, let's say, vibe coding is that anyone can pick it up very quickly. And in a huge company like yours, with tens or hundreds of thousands of employees, you want to unleash them. At the same time, you have to do it in a way that doesn't introduce too much risk.
Monika Malik
Yes, that's why we are calling it AIFC, AI-fueled coding, instead of vibe coding: we are designing these archetypes, and we are going to use those archetypes instead of a simple prompt.
Fred Schonenberg
I love it. What was that again, AI-fueled coding?
Monika Malik
AIFC, AI-fueled coding.
Fred Schonenberg
Very cool. Yeah, it's very interesting, because that's one of the tensions: you want everybody to take advantage of these tools, but you also need to protect the enterprise. If an enterprise wants to responsibly scale AI over the next one to two years, what do you think are the first couple of investments or areas they should prioritize?
Monika Malik
We have to have an AI governance framework with a model registry, versioning, and an audit system. Then, from the data perspective, we have to have a unified data lineage platform. Here at AT&T, what we are doing is creating MCP servers for the data products, and for the tool part, like I mentioned with the probabilistic versus deterministic split, we have Python scripts for the deterministic parts of the workflow.
We deploy those as MCP servers. And then, speaking generically, we have to have reusable agents. It is good to have those archetypes, and it's good to have a prompt library as well. If I'm building a linear regression model and using a certain prompt, and I keep it in a library, I can share it with other developers, and they can just tweak it based on their requirements. It's also very important to have an evaluation framework.
I've seen a lot of people building these agentic workflows who focus only on model accuracy and don't have any evals. So I'll give you an example. Let's say in my workflow I have two agents: a data compilation agent and a calculation agent. The data compilation agent is responsible for all the data transformations: reading the input data from various data sources, applying transformations, and preparing the data.
Then we have the calculation agent. When this workflow executes, at the end of the execution we need evals. Currently we are using a third-party system, in which we have written evals to make sure that in the execution we can see the tool sequence and the agent sequence: first the data compilation agent executes, then the calculation agent, then the validation agent. In this scenario the sequence is also important.
You cannot run your validation agent before your data compilation agent. And let's say we are preparing one final report with certain columns; all those validations are really important, and they give us confidence that the workflow is executing as expected, doing what we have written and what we are asking it to do.
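A sequence eval like the one described, checking that the agents ran in the required order before trusting a run, can be sketched in a few lines. The agent names mirror the example in the conversation; the trace format is an assumption, since real eval platforms each define their own:

```python
# Sketch: a post-run eval that passes only if the required agents executed
# in order (extra intermediate steps are allowed). Trace format is invented.

REQUIRED_ORDER = ["data_compilation", "calculation", "validation"]

def eval_sequence(trace):
    """Return True iff REQUIRED_ORDER appears as a subsequence of the trace.

    A single shared iterator means each required agent must be found
    *after* the previous one, which enforces ordering.
    """
    executed = iter(step["agent"] for step in trace)
    return all(name in executed for name in REQUIRED_ORDER)

good_run = [
    {"agent": "data_compilation"},
    {"agent": "calculation"},
    {"agent": "validation"},
]
bad_run = [
    {"agent": "validation"},       # validation ran first: must fail the eval
    {"agent": "data_compilation"},
    {"agent": "calculation"},
]
print(eval_sequence(good_run), eval_sequence(bad_run))
```

A real deployment would run checks like this automatically at the end of every workflow execution and log the verdict alongside the run, so a failed ordering check blocks the output instead of being discovered by a frustrated reviewer.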
Fred Schonenberg
Yeah. I think it's really interesting. And I was going to ask this question in a different way, but you just mentioned the third party. I'm curious what your stance is in AI around, we talk a lot on the show about buy, build or partner, right? Should you build it internally? Should you acquire somebody by them or is it, or partner with an external partner? What do you think about that in the world of AI, which I think is kind of shaking up how most companies have traditionally approached the buy, build partner?
Monika Malik
So I will say the platform, every enterprise has to build internally, in-house. The models, they can buy: we have a lot of good companies, like OpenAI and Claude. We can buy those existing LLMs; we don't have to write our own LLM from scratch. Maybe an SLM we can build ourselves, but there's no point spending more time there. And then for partnership, we can use infrastructure like the cloud, Azure, AWS, or GCP, so there we can partner with third parties, or for these evals we can use companies like Arize.
Fred Schonenberg
Very interesting. So looking three to five years out, what do you think is going to separate large enterprises that are operationalizing AI from those that maybe get stuck in experimentation?
Monika Malik
Yeah, if companies treat AI solutions as experiments, they are not going to win. The enterprises that win will treat AI as infrastructure rather than experimentation, redesign their current processes rather than just automating them, and build internal AI capability. We cannot just rely on outsourced intelligence, and we really have to invest in AI literacy across the business units.
Fred Schonenberg
Very interesting. Okay, I'm going to get you out of here on this. We're going to do something we call rapid fire: I'll give you a couple of different topics, and you give me a quick, one-sentence-or-so answer, so we can cover a whole bunch of different areas. Are you ready?
Monika Malik
Yes, I think so.
Fred Schonenberg
All right. So just give me a description of the current state of enterprise AI.
Monika Malik
Transitional. We're still in a learning phase.
Fred Schonenberg
What do you think is the most overrated AI capability right now?
Monika Malik
Replacing humans. We do not have fully autonomous agents; we need a human in the loop.
Fred Schonenberg
On the opposite side, what do you think is the most underrated?
Monika Malik
Data lineage. As a data engineer, I would say that's the heart of every application.
Fred Schonenberg
Yeah. What is one metric every enterprise AI leader should be tracking?
Monika Malik
Business value per production AI system. If you're building something, what is the business value coming out of it?
Fred Schonenberg
I love that. What do you think is the biggest hidden risk in scaling AI?
Monika Malik
It might be creating a lot of governance debt, is what I feel.
Fred Schonenberg
Very interesting. Where do you think AI will create the most value? Do you think it'll be on cost reduction or on generating new revenue, new opportunities?
Monika Malik
I feel the short-term win is cost reduction. But if we look at the bigger picture, it is creating new operating and revenue models.
Fred Schonenberg
Monika, thank you so much for sharing all of your insights and your time today. It was incredibly interesting. And is there anywhere you'd like to direct the audience if they want to either get to know you better or learn more about what you're building?
Monika Malik
Yes, they can always reach out to me on LinkedIn, and we can have a talk. I'm also in the learning phase, but it's a great journey, and I'm very fortunate that at my company I'm working on such an interesting project, where I'm learning a lot about AI and building these agentic solutions.
Fred Schonenberg
Well, thank you. I think that's the right attitude, right? All of us need to be in the learning phase and approach this as such an opportunity to learn. So thank you, Monika. Really appreciate your time today.
Monika Malik
Thank you, Fred, for having me. Have a nice day.
VentureFuel builds and accelerates innovation programs for industry leaders by helping them unlock the power of External Innovation via startup collaborations.
