DataTopics Unplugged

#47 MLflow, Llama 3 Unleashed, and the OpenTofu vs. Terraform Drama

April 24, 2024 DataTopics

Welcome to the cozy corner of the tech world where ones and zeros mingle with casual chit-chat. Datatopics Unplugged is your go-to spot for relaxed discussions around tech, news, data, and society.

Dive into conversations that should flow as smoothly as your morning coffee (but don't), where industry insights meet laid-back banter. Whether you're a data aficionado
or just someone curious about the digital age, pull up a chair, relax, and let's get into the heart of data, unplugged style!

In this episode, we're joined by special guest Vitale Sparacello, an MLflow Ambassador, to delve into a myriad of topics shaping the current and future landscape of AI and software development:

Go check out the YouTube video after, so you don't miss out on Murilo in that suit he promised (with a duck tie, of course).

Speaker 1:

You have taste in a way that's meaningful to software people.

Speaker 2:

Hello, I'm Bill Gates. I would recommend TypeScript. Yeah, it writes a lot of code for me and usually it's slightly wrong. I'm reminded, incidentally, of Rust here, rust.

Speaker 3:

Congressman, iPhone is made by a different company, and so, you know... Rust. Well, I'm sorry, guys, I don't know what's going on. Thank you for the opportunity to speak to you today about large neural networks. It's really an honor to be here. Rust. Data Topics. Welcome to the Data Topics. Welcome to the Data Topics Podcast.

Speaker 1:

Welcome to the Data Topics Podcast, live streaming on YouTube, LinkedIn, X, and what have you. Uh, check us out on Twitch. Twitch, Twitch... should we have any updates on the Twitch page, or do you just know only that we're streaming there?

Speaker 1:

Only that we're streaming there, but it's okay, we gotta start somewhere and that's where we are. Um, feel free to leave a comment, question, anything. We'll try to address it. Today is the 23rd of April of 2024. My name is Murilo. I'll be hosting you today, joined by the one and only Bart. Hi. And we have a very special guest. Yes, of course, my favorite song. I hope people can listen to his song too, to Vitale's favorite song. Favorite song, right? Only for ski trips.

Speaker 2:

So yeah, Vitale, the one and only, our Italian wonder boy, tech lead AI at dataroots. Yes, MLflow ambassador. MLflow, one of the eight MLflow ambassadors in the world; he recently became an MLflow ambassador. Happy you're back on the podcast. Thank you.

Speaker 1:

Thank you for having me. Yes. Are you the only MLflow ambassador in Belgium? Yes, yes. Look at that. So if you go there, the MLflow ambassador program, you're gonna see the ambassadors here and you can... well, who's that? Wow?

Speaker 3:

that's it was a cool picture.

Speaker 2:

Right, that's a cool picture are you also the only italian ambassador? Uh, yes, yes the only in belgium and the only italian. Yes, wow, look at that. And the cutest for sure.

Speaker 1:

Yes, wow, look at that, and the cutest for sure.

Speaker 2:

Definitely. If you look at this, it's definitely the cutest right.

Speaker 1:

Definitely by far. Not even a competition man.

Speaker 3:

I think there is someone else, not Italian, but he's working in Italy.

Speaker 2:

I thought you were going to say more cute.

Speaker 3:

They are really cute. It could be, yes, but it's actually a new program from MLflow. MLflow is now... let's say not now, but the Linux Foundation is putting an extra effort in promoting the adoption of this open source technology, and they are starting new programs, and MLflow Ambassadors is one of them. This was the first batch of ambassadors. They will soon open applications again, let's say. So if people are interested, then you can always look at the website. The main goal is to promote the technology by knowledge sharing with your community, within your, let's say, working environment, but also with external people, and you can also contribute to the blog posts and social media of MLflow.

Speaker 1:

Maybe a question for people that have never heard of MLflow: what is MLflow?

Speaker 3:

Yes. So MLflow is an open source framework. They call it a platform now, one that helps you to manage your machine learning life cycle. So when you train machine learning models, not, let's say, in a trivial environment or for simple use cases, for example if you are working on real-world data or you have your first, let's say, professional engagement, you maybe want to use MLflow to track your experiments, in order to check, by changing your code, your parameters or the data itself, what is making your model perform better. Also, to have a centralized repository to store your models for later deployments.
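For readers who want to see what that tracking looks like in code, here is a minimal sketch of an MLflow experiment run; the experiment name, model and metric are illustrative, not taken from the episode.

    import mlflow
    import mlflow.sklearn
    from sklearn.datasets import load_iris
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.metrics import accuracy_score
    from sklearn.model_selection import train_test_split

    X, y = load_iris(return_X_y=True)
    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

    mlflow.set_experiment("iris-baseline")  # groups related runs together
    with mlflow.start_run():
        n_estimators = 100
        model = RandomForestClassifier(n_estimators=n_estimators).fit(X_train, y_train)
        accuracy = accuracy_score(y_test, model.predict(X_test))
        mlflow.log_param("n_estimators", n_estimators)  # what you changed
        mlflow.log_metric("accuracy", accuracy)         # how the change performed
        mlflow.sklearn.log_model(model, "model")        # store the model for later deployment

Every run then shows up in the MLflow UI, so you can compare which change to the code, parameters or data actually made the model better.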

Speaker 3:

And also, MLflow helps you to deploy your model on different targets. For example, from MLflow you can deploy on SageMaker, Databricks, Azure ML, or even on on-premise Kubernetes clusters. So it may sound complicated, but it's quite easy to use, because MLflow comes with a Python API, so that you can basically interact with it by code, and also an open source server that you can run, that you can host yourself. You can run it everywhere you want, and it's, let's say, feature rich, so you can do everything you want with it.
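As a rough, hedged sketch of the self-hosted side he describes, the open source tracking server can be started with the MLflow CLI and the Python API can then register a logged model; the backend store, port and run ID placeholder below are assumptions for illustration only.

    # Start the server yourself (shell), for example with a local SQLite backend:
    #   mlflow server --backend-store-uri sqlite:///mlflow.db \
    #       --default-artifact-root ./mlruns --host 0.0.0.0 --port 5000

    import mlflow

    mlflow.set_tracking_uri("http://localhost:5000")  # point the client at your own server

    # Promote a logged model into the central registry; <run_id> comes from the tracking UI.
    registered = mlflow.register_model(
        model_uri="runs:/<run_id>/model",
        name="iris-classifier",
    )
    print(registered.name, registered.version)  # the registry assigns incrementing versions

From the registry, the same versioned model can then be pushed to whichever deployment target you use.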

Speaker 1:

Yeah, I think... maybe the way I also think of MLflow is that it's open source, but it's also the MLOps tool, like the standard today in the industry. Would you say that's a fair statement?

Speaker 3:

I think it is, because MLflow is creating a big community around it and is investing a lot in changing the product in order to meet the demand of the people. For example, as you can see in the webpage that you were sharing before, now the focus is not only on classic, let's say, standard machine learning and deep learning, but they are also shipping a lot of new features related to generative AI. For example, let's say you would like to prompt engineer a solution. Instead of randomly trying, testing, changing and retesting your prompts, you can create a formal experiment in MLflow in order to perform a more robust evaluation. These applications are becoming more and more frequent in industrial environments, in business environments, so companies are shipping generative AI models in production, prompt engineered from a base model, for example the GPT models from OpenAI.

Speaker 3:

Maybe you want something robust to track how you are performing your experiments and to test it, because we saw already in the past how many issues these models have, for example chatbots suggesting your competitor instead of your own company, or creating, let's say, random discount codes for users. So if you include in this application a robust test suite, you can better test your prompts and your overall product.
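As a hedged illustration of what a formal experiment for prompts can look like, the sketch below logs a prompt variant as an MLflow run and scores it against a tiny reference set; the prompt template, the call_llm helper and the dataset are hypothetical, and mlflow.evaluate as shown assumes a recent MLflow version with the LLM evaluation extras installed.

    import mlflow
    import pandas as pd

    PROMPT = "Answer in one sentence: {question}"  # the prompt variant under test

    def call_llm(prompt: str) -> str:
        # Hypothetical stand-in: call whichever provider or model you are prompt engineering.
        return "MLflow tracks parameters, metrics and artifacts."

    eval_data = pd.DataFrame({
        "inputs": ["What does MLflow track?"],
        "ground_truth": ["MLflow tracks parameters, metrics and artifacts of ML runs."],
    })

    with mlflow.start_run(run_name="prompt-v1"):
        mlflow.log_param("prompt_template", PROMPT)  # the prompt is just another parameter
        eval_data["predictions"] = [
            call_llm(PROMPT.format(question=q)) for q in eval_data["inputs"]
        ]
        results = mlflow.evaluate(
            data=eval_data,
            predictions="predictions",
            targets="ground_truth",
            model_type="question-answering",  # built-in text metrics instead of eyeballing
        )
        print(results.metrics)

Re-running the same script with a prompt-v2 gives a second run to compare side by side, which is the more robust evaluation being described.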

Speaker 1:

Cool, very cool, a lot of cool stuff. I didn't know about the GenAI thing, actually, but it's very new. And I think also MLflow has been there before MLOps was even a thing. I remember hearing about MLflow, the experiment tracking, before people were using the term MLOps. And also there are other competitors, just to... like DVC, I think, is another popular name. There's also Guild AI, there is also... you can pick your own MLOps, experiment tracking, all these things. They are very cool.

Speaker 3:

Yeah, they are valid alternatives, of course. DVC especially is also a well-known tool. I think DVC focuses a lot on data versioning and pipeline versioning, for example to have an easy way to track if your data is changing or if your code to train your model is changing, and then adapt your flow to reproduce your experiment only if there is a substantial change in your data. While MLflow addresses challenges such as, uh, experiment tracking, the model registry itself, it doesn't rely on, for example, Git like DVC does, and also model deployment, so it then integrates a server. Really cool, happy to have you here.

Speaker 1:

Thank you, yeah. How, and how... how are you doing? I think we skipped a bit of chit chat. But, uh, yesterday Vitale and I were playing futsal together, and there was a moment that I was like... I looked into Vitale's eyes.

Speaker 1:

I saw his soul through his eyes. I was like... I was very intimidated, but then I was like, and I was like, mentally, it was like, come, come on, give it, the ball, come. And actually, like, I swear he heard me, because he was like... and then he actually went and, um, I passed in the ball and the rest is history. Back of the net. Wow. Yeah, we won. Congrats.

Speaker 2:

Yeah, can we?

Speaker 1:

get an applause Congrats. Can we get an applause? Just a regular email.

Speaker 3:

We are first in the league at the moment. Oh really, yes, in the futsal league.

Speaker 1:

But that's not DataFoods. That's the team of Jonas.

Speaker 2:

And maybe something... the elephant in the room here, or maybe the duck in the room here. For the people that are listening that can't see, they will have wondered by now what is happening, but Murilo is here in a suit with a very special tie. Maybe you could explain a bit about the occasion? Well, I think it's a bit disappointing.

Speaker 1:

It's not really an occasion. I think last time, you said... we were talking about suits, and you said, and I said, I'll come with the suit next time, and I feel like this is kind of a suit. It's still kind of casual, right, it's not something for a formal occasion. But also, I have Dutch classes later, so I thought it would be weird if I just come super formal and I'll have to explain it to people. So I thought this was a nice compromise in the... No, it's fancy. Yeah, it looks good on you. Oh, thanks. Especially the tie. Yeah, the tie.

Speaker 1:

So what is it about the tie?

Speaker 2:

maybe you want to paint a picture with words here the tie has a lot of ducks on it right, a lot very similar, but not in the pixel art thing that is on top of me.

Speaker 1:

Yes, um, and I'm more surprisingly, probably, is that I had this at home. So, um, I was like thinking like okay, yeah, suit, and like, okay, the ties.

Speaker 3:

I have the perfect tie, and it was this one. It was all meant to be. It was meant to be. So you will go to your Dutch lessons with... Not at all.

Speaker 2:

I think that would be next level.

Speaker 1:

It would be too much.

Speaker 2:

Just see if anyone says anything.

Speaker 1:

Yeah, but actually there could be good advertisement, right, like oh yeah, because we do this podcast, you can check us out on YouTube and then you know, Do it, it's an order, or maybe, I think, murilo, to test how, how serious you are.

Speaker 3:

I think you should go to a client meeting with this tie.

Speaker 1:

If bart gives me the green light, maybe I will.

Speaker 2:

I think it's a.

Speaker 1:

It's a nice, uh conversational piece I think so, but it would be super awkward too if, like if you go in a meeting and no one asks any question, you know like they're just like oh yeah, how was your, how was?

Speaker 2:

everybody will just think like this, just weird guy. Let's just ignore this that's the goal.

Speaker 3:

It doesn't matter what you are wearing, but people will believe what you're saying.

Speaker 1:

So, how you think... it's because you are so good that it doesn't matter what's your dress, how you dress. Not sure about that. We'll have to circle back on that one. But, um, how's your weekend, Bart? Maybe... I don't want you to feel excluded.

Speaker 2:

Anything special? My weekend was very good. I went, uh, ski mountaineering. Oh, this weekend? Last weekend, yeah, a long weekend. Um, went to Austria. We went, uh, by car. Wow. Um, we climbed a few nice peaks. Weather was very bad though, but, uh, snow was good. Oh really? There's still a lot of snow, because the weather was very bad. Yeah, I see, I see. Okay, interesting.

Speaker 1:

Wow, so you just went to Austria and back on a weekend, whatever, long weekend. Long weekend, okay, pretty cool, pretty cool. And all right, what do we have this week? Maybe we can start with this timely news. Also, you mentioned GenAI in your MLflow intro, Vitale. I think it's hard not to talk about GenAI these days.

Speaker 2:

What is not Gen AI these days right?

Speaker 1:

Yeah, it's hard to keep track. What is this? Introducing Llama 3, the most capable openly available LLM to date. This is from Facebook, and, um, what is this about? Maybe I'll pass it to Bart or Vitale, I guess. Bart, um, so this was introduced, I don't know...

Speaker 2:

Actually, I think it was introduced when I was, uh, out on... Actually, I think it was introduced when I was out on holidays, more or less, right?

Speaker 3:

Yeah, it was five days ago, more or less.

Speaker 2:

So I haven't been able to test it a lot. So Llama 3 is the successor to Llama 2, which is Meta's, or Facebook's, LLM, which is, was... and I see it now in the title, now that it springs to my eyes, is that it was originally touted as the most capable open source LLM, but apparently Llama 3 they're positioning as the most capable openly available LLM, which I think is more correct if you read the small details.

Speaker 2:

Yeah, the fine print. Um, I have, uh, seen it a lot passing by on socials, Reddit, Twitter, um, my Mastodon account, uh, and what I hear from the community is that it is actually very, very, very performant. Performant as in the quality of results, and that for specific use cases it is as close as, or outperforms, GPT-4.

Speaker 2:

That's very impressive. Um, I've tested it myself a little bit on Groq, so that's maybe the easiest way to test it. You go to, uh, g-r-o-q dot com, Groq, uh, you log in, and then you can basically select which Llama 3 version you want to, uh, use. They have, um, uh, yeah, that's the one, different... a 70B and an 8B version available, each with a different token context. The default is 8,000 tokens. It's very easy to test it, but you need to make an account on it. I tested it a little bit. Just by the little tests I did, it's indeed very comparable to GPT-4.
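For listeners who prefer to test the same thing from code rather than the web UI, here is a hedged sketch using Groq's Python SDK; the package name, model identifier and environment variable follow Groq's OpenAI-style client as documented around this time and may have changed since.

    import os
    from groq import Groq  # pip install groq

    client = Groq(api_key=os.environ["GROQ_API_KEY"])

    response = client.chat.completions.create(
        model="llama3-70b-8192",  # the 70B variant with an 8,000-token context
        messages=[{"role": "user", "content": "Summarize what Llama 3 is in two sentences."}],
    )
    print(response.choices[0].message.content)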

Speaker 1:

Maybe you said they changed the naming to open source, to openly available well, it's now that I see it on the title.

Speaker 2:

The title here of the article that you have on the screen is "Introducing Meta Llama 3, the most capable openly available LLM to date". But then I know that they touted it before as an open source model, but then... so, which gave them a lot of flak.

Speaker 1:

Yeah, but we still can take the weights.

Speaker 2:

Yeah.

Speaker 1:

Yeah, you can still use it. So anyone can use the weights. We can download it, we can put it. So why is it not open source? It's because there are some restrictions on how to use it.

Speaker 2:

There are restrictions. They have their own license, I think, on this, which is not a quote-unquote open source license, and there are some limitations on how you can use it, for what amount of users, these type of things. Uh, it's also not... well, I don't know, actually. Llama 3 I haven't downloaded; the previous one...

Speaker 2:

You had to fill in a form, like, I want to download this, give some details, um, which is probably still the case. It would be the same, yeah. Yeah, so it's a very, uh, open model, or openly available, um, both the source code as well as the weights, not the data. Um, I think what Meta tries to do is, uh, to protect the model.

Speaker 3:

Um, let's say, if you are, if you are a large organization and you maybe want to embed their model in a sort of platform or widely available product, they protect against this kind of use case. For example, let's say Apple wants to use it to empower Siri. But if you are a business, a research institute or an individual, you can basically download the model and do what you want with the weights: also fine-tune it, retrain it, use it to produce new data, as a sort of training set from its generations, to create more specific models.

Speaker 1:

Yeah, I think Llama 2 is the one where there was the drama that someone leaked the weights without the form. It was one or two, indeed.

Speaker 2:

Yeah, maybe I thought it was um, and I think, because it's openly available, but, like both the weights, the source code, that that you can um, you can fine-tune it, you can retrain it, you can easily deploy it yourself is that it is probably one of the most widely self-managed models. That is actually being, uh, if you deploy it yourself, like if you do something on-prem or in your own private environment, um, it's probably the most widely used.

Speaker 3:

Yeah, it's also, I think, the most powerful indeed, up to date. And it's also crazy how fast the, let's say, open movement, I don't want to say indeed open source, the open movement of LLMs is going. Because, uh, let's say, we saw OpenAI with GPT years ago, I think; when GPT got released, it was the first big step in advance.

Speaker 3:

We thought like oh, this technology is so unique, it's so, let's say, hard to replicate. You need hundreds of GPUs, you need a lot of money to train something similar, to craft a big enough training set for that. But then, after a couple of years, more and more companies are replicating and going closer to GPT performances even to GPT-4, and it's available for everyone. I think it's very cool. Of course, it's not easy to run. You still need some compute power but the fact that it's making let's say they are making it available to a larger audience. It will allow for, for example, optimizations, quantizations, so maybe more and more people will be able to use something similar in a protected environment.

Speaker 1:

Yeah.

Speaker 3:

So I think it's very cool.

Speaker 1:

True, and just to wrap this part as well: according to The Verge, Meta says that Llama 3 beats most other models, including Gemini.

Speaker 2:

so indeed, and the subtitle there says actually, indeed, that meta themselves. They don't mention anything about performance versus GPT-4, which is true, but you see there's a lot of chatter about how it compares from a more qualitative point of view.

Speaker 1:

Testing it out against GPT-4, which apparently is impressive. I do think that GPT-4 is the standard still, right? So I think that's the... it's argued that Claude 3 is very close, but yeah. But I still feel like if there's a new model that comes out today, they're gonna compare it to GPT-4, right?

Speaker 3:

yeah, that's true, I agree with you, yeah something I really really like about the lama model and meta research in general. This is a weird sentence. I think if I would have said like I really like meta or facebook three, four years ago, everyone would have been like, oh, you're weird, because it was the bad company right online. But now they are releasing all this technology and they are also releasing all the white papers summarizing a bit the research and what they implemented in the model itself. And it's quite interesting because then even larger companies can somehow adopt strategies, methodologies or particular, let's say, tricks. That makes this LLMS a bit more performant.

Speaker 3:

I still remember, last year, for a Microsoft event, Andrej Karpathy, when he was still at OpenAI, he was on the Microsoft stage talking a lot about Llama, the Llama architecture, how the Llama model has been trained, instead of, for example, GPT. So the fact that they are releasing, yeah, the weights is very cool, also the code around it, but also the research paper. And OpenAI is not doing this anymore; like, we don't know yet the technical details of GPT-4, while we know everything about Llama. That's true, that's true. I agree with everything that you're saying, but at the same time I wonder, like, what is their strategy?

Speaker 2:

on this because, I mean, zuckerberg has this, this, he has stated this before. I'm very much paraphrasing here, but, like this is about uh, he's opening this to the public to democratize powerful ai and to uh for also to understand, like what's about the impact can be for uh to to understand what you can, what you can and should do around safety on these things, um, but at the end of the day, I mean meta is in the business of making money like like like every every business is maybe in the evening and zuckerberg thinks like maybe there's an ethical point to this as well, but from nine to five, they're simply making money.

Speaker 2:

So what is it? What is the... what is the strategy behind this? And I think they're relevant in this space only because it is open. Like, if they did everything closed, Meta probably would not be a talking point today. Exactly. Or maybe it would be: oh, but yeah, where are you getting the data? Is it from my Instagram posts? Like, if it was not open, we would have a completely different discussion on this. Yeah, 100%. Like, otherwise they would have been the third after OpenAI, Google, Anthropic. And like, putting themselves out there in the market.

Speaker 2:

How they're doing it is very smart. They're a big player today.

Speaker 1:

But I feel like Meta was really invested. It was Facebook and I feel like they were taking a shift and they really bet on the Metaverse. They even changed the name of the company to Meta. They had the goggles and Apple Vision as well, the Vision Pro. There was a big announcement and I think I heard on a podcast not that long ago that people are reviewing now again the Apple Vision Pro. Like who has it? Do people still use it? People like mix a bit, you know, like okay, it's a bit clunky, it's a bit heavy, you know, and no one really I mean at least in my circle, no one really talks about the metaverse these days, right? So I feel like, indeed, if meta wasn't releasing open source models, because the bet that they put on the metaverse, it's really not really paying off.

Speaker 3:

Also, the timing: like, they decided to switch completely to the metaverse when GenAI and then LLMs were exploding. Yeah, indeed.

Speaker 1:

So yeah, I did uh but uh, so but cool for us. Yeah, I still think they still think it's positive, right like for, for the for the community?

Speaker 2:

Yeah, definitely, and like, that's what I'm saying, I'm not disagreeing with Vitale. I think they're... they're bringing a lot of transparency to the, to the community, and I still just don't truly understand what's in it for them, the way that they're positioning it.

Speaker 3:

Maybe this is going to cripple Microsoft. Now they will actually integrate Llama 3, Llama multimodal, and also some generative models for images into their products, like WhatsApp, Messenger, Instagram, so you will chat with your own assistant in WhatsApp itself.

Speaker 2:

Okay. So I think... Is it already there, or are they planning to do it? In the US they are releasing it at the moment, and it will soon arrive. And what can the assistant do for you? It's like having ChatGPT within, basically, your, um, smartphone, in your WhatsApp chat, and then I guess they will also include some integrations for all the other services they offer. So that means that pretty soon everybody will start sending GenAI-generated images on WhatsApp. Yeah, really likely.

Speaker 1:

It's a cringe, or like, uh... so the stickers will be GenAI-generated, or it'll be like the, the email. Now, you know, like, can you ask this person this in a nice way, or you want to invite your uncle for Thanksgiving, like these very formal messages, and not just five words without any capital: three paragraphs, some very formal text. Every, every WhatsApp message you come with, dear sir, dear, bracket open, name of the person, bracket close. Yeah, let's see. Actually, like in Brazil as well, WhatsApp is used a lot, like, for businesses.

Speaker 1:

There's a lot of stuff, a lot of features on whatsapp almost everywhere not in Belgium not a lot in Italy it's like the most but like, even like carousels, for, like I don't know, you can swipe around and look at a menu or whatever you know. Yeah, but like there are some things like that. I want to say again maybe I'm I know there was something that was like wow, I never seen that before. I've seen the chatbots.

Speaker 2:

Port, for example, has a chatbot, like you can say yeah, but that's, that's like that I see for businesses like they.

Speaker 1:

Yeah, yeah, which I think is nice. More on GenAI as well: we talked last week, I want to say, about RAG, retrieval-augmented generation. Yes, RAG to riches, that's what I remember. Um, and Tim Linders, I want to say, shared this Open Parse. I thought it was quite interesting. It's an open source package written in Python.

Speaker 1:

But basically, last week, when we were talking about RAG, we mentioned how, as part of the RAG application, let's say, you have the vector databases, and usually if you have a big chunk of text... well, not a chunk of text, we have a big document with a lot of text, we chunk it, right, and then basically you fetch it. This package improves that a bit. So basically, the idea is that it's more human-like chunking. So if you have... maybe I'll go on their GitHub page, share the documentation, share this tab instead. The idea is, if you have a PDF, right, usually... well, actually, the first question is, when you have a PDF, you have to convert it to strings. How does Python do it?

Speaker 1:

It probably is just going to be an OCR, but that doesn't mean that the text is actually going to work. If you have two columns, maybe you won't identify the two columns, maybe you will, right? There are things that can go wrong, and also, like, paragraphs, tables, all these things, right. So what does this package do? It just has a smarter way of chunking text, and on the image here, if you're following the live stream, you're going to see that there is a document with overview, driver, classification, development, and then a table, so three segments, and then they chunk it according to the segments, right.

Speaker 2:

So, instead of, let's say, the dumb way to do this, to chunk a text is like to say, after 200 characters, cut it, and then that is chunk one, then the next 200 is chunk two, exactly, but like there's no logic to it. And here, like this, what you have on the screen is like more or less you, you know, like this is a. This is a paragraph that is my. That is chunk one. Next paragraph that is chunk two exactly so.

Speaker 1:

It has like the locality of it, which is how a human would also read it right, and I think it avoids things like having, like you said, half of a chunk here and the other half of the chunk on the other paragraph. Okay, right so it looks interesting. I think it was pretty cool. Uh, I haven't looked so much into it, but I would imagine that this actually improves the performance yeah, indeed yeah, because I'm uh.
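To make the contrast concrete, here is a hedged sketch of the "dumb" fixed-size chunking next to Open Parse's layout-aware nodes; the openparse calls follow the project's README at the time of the episode, and the PDF path is hypothetical.

    import openparse  # pip install openparse

    pdf_path = "report.pdf"  # hypothetical document with headings, paragraphs and a table

    # Naive chunking: cut every 200 characters, with no regard for structure.
    flat_text = "...flat text extracted from the PDF..."
    naive_chunks = [flat_text[i:i + 200] for i in range(0, len(flat_text), 200)]

    # Layout-aware chunking: roughly one node per heading, paragraph or table,
    # closer to how a human would read the page.
    parser = openparse.DocumentParser()
    parsed = parser.parse(pdf_path)
    for node in parsed.nodes:
        print(node.text[:80])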

Speaker 2:

We're actually using this, not for chunking, something like this, like to go from PDF to, uh, text, um, but for GenAI purposes. And the, let's say, quote-unquote dumb way to do it is to just go to flat text, right? Yeah. What we actually do is that we have a lot of metadata, like: this text comes from, um, this, uh, position, uh, on that page, um, it is, uh, this typing, so that you can see, like, it's a bold font face, it's probably, probably a title. So we have a lot of metadata, um, which is good to have a rich expression of what is in this text, but it's also a lot, a lot of tokens. And the example that you're actually now showing on the screen, like, it's, it's a bit the same, it is the same context, but it makes a lot of abstractions. Yeah, so you just go: this is a chunk, this is a chunk, this is a chunk, and you basically ease the processing for this, for the LLM as well.

Speaker 1:

True.

Speaker 2:

So you have a lighter input, but maybe still the same richness of the context.

Speaker 1:

Yeah, indeed, you're doing this today. You said yeah, is it okay to share a bit more about it? What it is Top secret. You have to kill me if you tell me.

Speaker 2:

No, it's actually quite simple, but it's for a RAG-ish approach for a customer, so I can't share which customer. And they are processing a lot of documents, PDF documents, that they need to add to a file of a customer in a structured manner.

Speaker 2:

But for that we need to extract the text of the documents. But just the flat text is not sufficient enough to generate the correct summaries of that, and so we use a package and I want to say it's called PDF2JSON, but it's something like that, probably not completely like that and that gives you a huge amount of metadata, which I think is good for for the rest of the problem, but probably a bit too much to actually put all of this in the prompt.
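The package named here is only half-remembered, so as an illustrative alternative rather than the one actually used on the project, here is a hedged sketch of pulling text plus position and font metadata out of a PDF with pdfplumber; the file name is hypothetical.

    import pdfplumber  # pip install pdfplumber

    with pdfplumber.open("customer-file.pdf") as pdf:
        for page in pdf.pages:
            words = page.extract_words(extra_attrs=["fontname", "size"])
            for w in words:
                # Position and font hints let you guess structure,
                # for example a large bold face is probably a title.
                print(w["text"], w["x0"], w["top"], w["fontname"], w["size"])

That richness helps reconstruct structure, but, as noted, dumping all of it into a prompt costs a lot of tokens.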

Speaker 1:

I see, and then, like you're saying, this package, the geographical-ness would already be another add-on, yeah.

Speaker 3:

Bart, I have two questions related to this. Do you think the PDF format is a bit outdated for nowadays world, where unstructured information will be processed more and more?

Speaker 2:

in the coming future.

Speaker 3:

And we need to still rely on this technology, let's say, introduced a few years ago already. Maybe there is a better way to structure information in this kind of document.

Speaker 2:

Introduced. How old is the PDF? I think it's. I've never known it not to exist. Yeah so it's very clunky. Like the spec is super, super. You don't want to get into the nitty gritty details of the PDF spec. Yes, it's outdated, but I think it's also still going to be around here 10 years from now. That's a bit, but I agree with you. Like it's. Ideally we don't use PDF. Yeah, I agree.

Speaker 1:

Initial release June 15, 1993. Oh wow.

Speaker 3:

So it's 31.

Speaker 1:

That's young, right right, bart, it's a baby. Vitalis, oh, actually, no, 30 years. Is it old? Vitalis, 30 years old?

Speaker 3:

No, right? Yeah, not at all. And also, the second question, Bart, is that now we have all these multimodal models that can process visual information, including text. For example, for use cases like PDF processing, understanding the information, let's say, within the PDF: do you think it will be a better approach to process information with this kind of model, instead of getting the raw text and then feeding an LLM?

Speaker 2:

well, I just said it's a good question, but I've been laughed at before saying this. But, uh, maybe just go one step back, like pdfs, I mean, I think they're here to stay for the foreseeable future. I think there's a. It's the easiest way to send something. That is okay, I'm gonna send you an email. I'm gonna send you this and like this, like a document of multiple pages, like in a clean, structured manner.

Speaker 1:

I think that is like read only, though that's the thing that is read only.

Speaker 2:

Yeah, that is read only and that is easily generatable by a lot of different people and a lot of different systems. Like should you probably send this data to an API? Yes, but I mean like.

Speaker 1:

that's not reality, Even presentations, right? Like if you have a PowerPoint presentation, I've seen people convert it to PDF to share with people as well, because it's just a cross-platform.

Speaker 3:

It's like a de facto standard. But that's the thing, like...

Speaker 2:

I've seen you give JavaScript-based presentations, but from the moment that someone asks you, Murilo, can I have a copy? You're gonna somehow make a PDF out of that, right?

Speaker 1:

Yeah, you're not gonna make an image either.

Speaker 2:

No, no that is sad. I think the other question.

Speaker 1:

You're still with a bit of judgment. Huh, I've seen you make JavaScript presentations like oh.

Speaker 2:

No, but I've done it myself as well and I and I think it's very cool and it's very uh bit, uh bit edgy, bit quirky and a bit contrarian, and but I think it's very, it's very cool, but I've done it myself as well. But I've always ended up with the question oh yeah, but can we have a copy of the presentation? And I think, okay, yeah, shit, okay, I'm gonna do this, okay, there's a way to. Okay, there is this package and this package makes pictures of the web pages and that will convert that into a pdf. And you do that 10 times and 11 times and you think, okay, let's just use google slides because then like, is the presentation ready?

Speaker 3:

you know, I have a bug, I need to. No, I, I hear you I hear you, I, you.

Speaker 1:

I feel like I don't use the JavaScript presentations for any presentation either. It's usually for, like, if I'm doing a tech presentation, that I'm not expecting anyone to ask me anything, you know. But I'm going to work alone as well because I think all these things.

Speaker 2:

Yeah, I still use it here and there as well. But anyway, I think what was your question. Can you reiterate?

Speaker 3:

No, one was about the PDFs, and the other one is: instead of taking the raw text from the, let's say, visual of the PDF... So you have the table, you have some locality, maybe you have a picture close to a table, for example. Is it more efficient, maybe, to directly use the multi-modality features of new models and give the image itself as an input?

Speaker 2:

I think today... but now I'm just going off my experience. I think it's more efficient for: I have this small image with some text in it. I think that works pretty good. I think what doesn't work is: I have here this 30-page document, I'm not going to OCR it, I'm just going to send images of these pages, and now extract the data. I think that doesn't work very well. I think the performance is much, much better with the 30 pages if you do a text extraction and you inject that text into the prompt. For small things, I think the performance is there, and probably when we discuss this six months from now, we'll be in a different stage.
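A hedged sketch of the "small image with some text in it" path he describes, using an OpenAI-style chat API with an image part; the model name and image file are assumptions for illustration, not something specified in the episode.

    import base64
    from openai import OpenAI  # pip install openai

    client = OpenAI()  # reads OPENAI_API_KEY from the environment

    with open("table_snippet.png", "rb") as f:
        image_b64 = base64.b64encode(f.read()).decode()

    response = client.chat.completions.create(
        model="gpt-4-turbo",  # any vision-capable model would do here
        messages=[{
            "role": "user",
            "content": [
                {"type": "text", "text": "Extract the table in this image as CSV."},
                {"type": "image_url",
                 "image_url": {"url": f"data:image/png;base64,{image_b64}"}},
            ],
        }],
    )
    print(response.choices[0].message.content)

For a long document, the point above still stands: extracting the text first and injecting it into the prompt tends to work better than sending page after page as images.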

Speaker 1:

Yeah, Um no, I was going to ask you cause you mentioned performance, right, and I think like when I was showing this package, my first thought was like, oh yeah, if I ever had to do a RAG application or something I have to chunk text, I'll probably use that. But I haven't even looked at performance numbers. I don't have any guarantees that actually this will improve performance, right, and I'm just wondering if I mean, is it just like assuming that the performance is negligible? The performance increase, would you still use this? Because I still think there is a more explainability aspect to it which I feel like it's nicer in my brain when I think how is this organized? Like I feel like that's the way it should be do you mean?

Speaker 1:

A RAG is the way it should be? No, like the Open Parse thing, the way you chunk documents. Okay.

Speaker 2:

Well, I think it all comes down to having good tests, yeah, but like to actually have a, have a performance, and that you know if there's a change.

Speaker 1:

If you yeah, because to me but if the performance is like kind of the same, I would still use it. Probably, like I'll still use this thing, even though it's like a more complex right, but just because I feel like in my brain, like I don't know if I don't think that's a good argument yeah, that's the thing I don't think either, but that's uh. But for you then, like you have this project, you try this open parse thing. The performance is about the same.

Speaker 2:

You just like, yeah, no scrap it, just keep doing what we're doing now uh, that's a very hypothetical scenario, right like if try, if I went all the way to re-implement something in a new in a new technology and the performance is the same. I would probably keep it if I went all the way.

Speaker 1:

No, no, but like if you start a new project today, you try this, but like it's still in the beginning. So you try both ways.

Speaker 2:

Yeah.

Speaker 1:

And you see it's kind of the same. Early results show it's kind of the same. Would you still make the effort to maintain something?

Speaker 2:

I think it's up to the person building it, because then it doesn't matter. Yeah, maybe this is an extra dependency on a package that is not mature, and I didn't look at its stats, but it's also like, these type of things come into play, and then, like, if everything else is the same... Yeah, it's also true that, uh, the context length of LLMs is increasing. Exactly.

Speaker 3:

Like, Llama 3 is now 8,000 tokens, against 4,000 from Llama 2. So what if you ingest the full page, for example, or multiple...

Speaker 2:

...pages. Well, we're talking... what is the context length of Claude 3? It's over 100,000. I'm looking it up: the Claude 3 Opus has a context window of 200,000 tokens, expanded to 1 million for specific use cases. So you can even say, and maybe I'm going to ask the question to you, Murilo: let us assume performance is the same. Are you guys going to do, like, I'm going to have a very simple search, I'm gonna, like, uh, inject everything in my prompt, assuming that it's not a huge, huge knowledge base, but like, 1 million tokens is a lot, right? Um, or am I gonna use RAG, because then you know what is happening? Yeah, no, I feel like in that case... well, I think it then depends a bit, right? Like, because if, to use these things, the infrastructure needs to be different and, I don't know, more expensive...

Speaker 1:

I'll probably look more into that, yeah. But I feel like, if cost, infrastructure, maintainability, all these things are negligible in a way, or not at the forefront of our mind, then I would just include everything. I wouldn't chunk stuff, you know.

Speaker 2:

But I still feel like maintainability, uh, the infrastructure, all these things would probably be relevant yeah, but you can you get into these, these, these uh thought process, a bit like you were explaining where you think because it's right, because I'm gonna fetch more specific information, it's probably gonna be better. So I'm gonna try that.

Speaker 1:

Yeah, that's true, that's true. But I feel like that's where, like last week as well, I also shared that I heard people saying data science is like an art more than a science, and that's kind of what I mean. You know, like, you have this intuition, you have something. It's not something super concrete, but it is like a train of thought, you know, that you can articulate, you can... but you know, it's like you can't quite put your hand on it, but you can kind of, kind of feel it.

Speaker 2:

You know what I'm saying so you're saying it's a black box and so you need to have a decent test.

Speaker 1:

Well, I could say that too. That's not what I'm saying, but I think that's a very and then you need to use ml flow where you go to.

Speaker 3:

Of course, that's your family questions.

Speaker 1:

I have my person right here, so... are we okay to move on? Uh, yeah, yeah, yep. Another thing that is timely, so I would like to cover it today. We mentioned Llama 3, which is openly available, and so what does it mean, is it open source or not? There were also some waves in the open source community, namely with Terraform and OpenTofu, and I think last week, I want to say, maybe two weeks ago, not that long ago, there was some drama there. And I like drama, I'm a Latino, I'm Brazilian, so I'm attracted to that. And what was the drama regarding? I don't know what we'll keep for later. I'll be scared.

Speaker 1:

Um, what was the drama regarding Terraform? So Terra... Terraform, they changed their license, right, and some people said, okay, we don't like this change of license, we want Terraform to be open source, so we're going to fork it, which means we're going to make a copy of what is there today, a snapshot of what it is while it's open source, and we're going to keep doing our own thing. And that became OpenTofu. So, very paraphrasing here, OpenTofu was promised, quote unquote, to be an equivalent of Terraform, or at least it was, right, at the moment it was forked.

Speaker 2:

It was the same At the moment it was forked. Yes, I don't think they promised to be.

Speaker 1:

No, no, well, I don't think there were explicit promises to remain compatible. No, I don't think that was it, but I do think that was the mentality behind it.

Speaker 2:

yeah, um basically without jumping the gun.

Speaker 1:

Terraform released new features, so OpenTofu was quick to recreate those features, I guess reimplement them. Yeah, reimplementing is a better... it's a better way to phrase it. Right, which kind of gives that, that feature parity kind of thing between OpenTofu and Terraform. And maybe, actually, share this other link first. Yeah, Terraform sued OpenTofu, claiming that they violated the license. So right now, the license on Terraform, and again, correct me if I'm wrong, but I think the Terraform license says that it's free to use unless you're a competitor of Terraform. So we can use Terraform for our clients, because the solutions that we're building are not competing with Terraform. And again, there are some gray zones: what is it to compete, right? If I make a plugin and now we have 100 users, are we competing or not? Okay. But then now, yeah, with this new license, they're claiming that OpenTofu violated the terms of service, right. And then there's the whole discussion here.

Speaker 1:

This is a blog article from Matt Asay, I want to say, um, that OpenTofu may be showing us the wrong way to fork, right? Disagree with that. So the subtitle is: disagree with the license for the project, but don't lift the code and say it was always publicly available. Compare HashiCorp's code and license to OpenTofu's version, right. Let's see here.

Speaker 2:

There was also the I think they had, and what is the exact reason that Terraform is now suing them?

Speaker 1:

Because of feature parity, basically.

Speaker 2:

Because of feature parity.

Speaker 1:

Yes, yes, yes, so basically there was a.

Speaker 2:

So they're saying: we implemented this new feature on our branch, which you can no longer use. And on the old version that you forked, you now also added this feature.

Speaker 1:

It's a new feature which is comparable. Yeah, so it's like, the source was the same, and then now Terraform has this new feature that they released in this new version, but now Open...

Speaker 2:

...Tofu has a similar feature. But it's only a similar feature, not like it's the same code or anything; they just... So that's when you get to the nitty-gritty. Okay, right.

Speaker 1:

So the first thing that I will say is that OpenTofu acknowledged, yeah, that they received this... uh, I think it's like a cease and desist, I think that was it. Again, not to share the screen here... OpenTofu was... close this, okay, never mind.

Speaker 1:

OpenTofu was recently made aware of the letter from HashiCorp's lawyers. Basically, they're denying that they violated the license, and then, even in the open, they did a more thorough analysis here, right. So really, OpenTofu has "our response to HashiCorp's cease and desist letter". So they really went line by line, and they started to see what is the same, what is not the same, what variable names are the same, what files are the same and all those things, right. And then, I did like... the community is a bit split, right. There was someone that was mentioning, like, three levels of copying code. Like, you can literally just copy code; you can re-implement it in, like, a different way, right, but you really implement trying to do the exact same thing; or you can look at it and just be inspired by some features and do something similar yet related, right. So at which point are we actually violating the license?

Speaker 2:

And they're arguing that they did the latter.

Speaker 1:

They were inspired by but did their own implementation. Yeah, but they did their own implementation. But it's very hard to say right?

Speaker 2:

When the code is available. Would that be okay? Well, that's what the that's what the lawyer will figure out. The result of this will say right.

Speaker 1:

Yeah.

Speaker 2:

Interesting.

Speaker 1:

So there is this, and also the other... Today, for me, at least in my head, when I think of OpenTofu, I think of an open source version of Terraform, but there's clearly some noise already, because are they the same thing, quote unquote, right? Um, and then there was even the discussion of what's going to be the future of OpenTofu, right? Because maybe they should stop looking at Terraform at all and they should just try to do their own thing. They should have a vision of a different project, instead of just being an open source Terraform. And also, if they are trying to be an open source Terraform, that's a losing battle for sure, right? Because Terraform is allowed to copy code from OpenTofu, because OpenTofu is open source, right? So again, is it going to be two versions of the same, two different versions of the same thing? Probably not the way to go, but I'm curious to hear what you think. Why?

Speaker 3:

Because they claim they want to be the open source Terraform basically. So investing time and energy to replicate something that exists, just to release it with another license, that's a really lost battle, I think.

Speaker 1:

Yeah, but I do think also, like, the project should be independent, right? I want to be the best open source infrastructure-as-code tool. But then it won't be an open source Terraform, right? It will be an open source infrastructure-as-code tool, which will be different from Terraform, would have different features. It would have... you know, if every time Terraform releases something, they copy it...

Speaker 3:

Yeah, no, I agree, yeah, I think that's also very fishy.

Speaker 1:

Yeah, that's the whole point of terraform, like changing their license as well.

Speaker 2:

Right, yeah, it's difficult. I mean, you're gonna say that's a difficult one.

Speaker 2:

The whole thing, how this came to be, is that Terraform changed their license, so that it didn't allow direct competitors, basically, anymore to use it for directly competing services. So HashiCorp has services to manage your environment, managed services, where they very heavily leverage Terraform, um, to, uh, manage your environment. And quote-unquote competitors, some only partially, some to a large extent, like Gruntwork, env0, Spacelift, they joined forces and they created a fork of Terraform, because they basically are competitors to Terraform's, to HashiCorp's, managed services. So it's already, uh, like, how it came to be, it's not just "we in our hearts believe that it needs to be an actual open source license". No, it's like, the pain point is: we can't use it anymore, and this is key to our business. We've built a business around what you built.

Speaker 1:

Yeah.

Speaker 2:

And thus we need to do something, and the result is that fork. Yeah, I mean, that's what you see in the OpenTofu... I think it's a foundation, right. I mean, the big players there are, to some extent, competitors of Terraform. True. And it's very questionable how viable this is, and I think another topic that will be important in OpenTofu's adoption going forward, whether or not it will be a success, is also that, at some point, you need to migrate away from Terraform to OpenTofu, and for people that have a complex infrastructure, for which you typically use things like Terraform, migrating away from that is not that easy. But there is, like, this small window of opportunity where, if you used the version where it was forked, or very close to it, later or before, it is binary compatible, and it's like, you just switch the...

Speaker 2:

...CLI to the, to the OpenTofu CLI, and you can use it. But the longer you're waiting, the less compatibility there is, and the bigger the risk of making it an actual big migration project, for which there are very little arguments to make if you are a large enterprise and you're already using HashiCorp. But also, it's like, again, if you're not competing with Terraform, you are free to use the license.

Speaker 1:

I guess the only fear is that if, if hashicorp changes the license again, right, or but what I mean today, what arguments do you have for not going for terraform, even if you have Terraform and OpenTofu?

Speaker 2:

As an individual developer. I think the only way as an individual developer that doesn't care about sustainability and professional support and all these things as an individual developer, is that there is this very strong belief in that whatever you use needs to be open source.

Speaker 1:

I think that is the only very strong argument to make. Like, personal values, like, you believe in open source, you want to share. Yeah, I agree. Um, yeah, I agree, I completely agree with you.

Speaker 2:

I mean, if you are a business and they need to decide on, am I gonna use OpenTofu or Terraform? Then your next thing is going to be like, what is the longevity of this? What is the community behind this? Can I get professional, long-term support from this company? Yeah, and you're not gonna say yes to...

Speaker 1:

I mean, you're gonna have to have very strong arguments not to go to Terraform. Yeah, yeah, I completely agree. And I think, again, this may make things even a bit more complex in a way, because this is a one-way street, right? Like, Terraform to OpenTofu, that's what this cease and desist is about, but OpenTofu to Terraform is always going to be a viable option for them, because OpenTofu is open source.

Speaker 2:

OpenTOFU to Terraform is going to be a viable option. In what sense do you mean For?

Speaker 1:

Features. For example, there's a very brilliant feature that they implement in OpenTofu. For them to just kind of take that feature and reimplement it, or even copy-paste, if it's possible, if the API is still similar enough, that's fine. It's fair game, right? So then, that's, in a way, the way Terraform can position themselves: it's going to be a superset of OpenTofu.

Speaker 2:

I like that. If they extract it and still have the license for that component, yeah, they can use it.

Speaker 1:

Yeah, but even if OpenTOFU creates a new one today, because it's open source and anyone can use it, right? So Terraform will be a….

Speaker 2:

Under the terms of the license.

Speaker 1:

Right, yeah, but OpenTOFU will always be a subset of what Terraform is in terms of features.

Speaker 2:

Well, there's maybe a hot take.

Speaker 1:

Maybe a hot take okay.

Speaker 3:

Oh, hot, hot, hot, hot, Hot, hot, hot, hot, hot hot.

Speaker 1:

Sorry, sorry.

Speaker 2:

All right, I came across this sample, I couldn't not add it. All right, but maybe, as we're nearing- I actually hope that other people hear this, but samples should be fine. These samples, these should be fine. The only question is if we From the laptop, from the laptop, yeah, yeah, yeah, yeah.

Speaker 1:

I think the samples are fine, but I'm talking about hot takes. Maybe it's time for a hot take.

Speaker 2:

Go for it.

Speaker 1:

Well, I'll pick yours. Actually, today, you always pick mine because I have mine. I just put them for a rainy day, you know, when I'm really busy, I don't have time and I just slap it in there. The future of AI gadgets is just phones. What do you have to say about that, bart? By the way, the soundbite when, when you envisioned it, were you envisioning it for the hot take section. So just like, like right now.

Speaker 1:

For when someone yeah for example, like the hot hot hot. So can you please, alex, would you mind? Oh, hot, hot, hot, hot, hot, hot, hot, hot, hot, hot. Is this from where is this from actually?

Speaker 2:

Well, when is this from actually? Well, I burned my hand the other day when I was frying bacon.

Speaker 1:

You were FaceTiming someone.

Speaker 2:

Yeah, and I was like I'm going to burn myself.

Speaker 1:

Let's record this, okay.

Speaker 2:

Then I got this out of my sample vault. It's under my sofa at home.

Speaker 1:

It's just like you have it there on a safe yeah for a rainy day right um glad you shared that one with the no, but the future of ai gadgets is just phones.

Speaker 2:

It's actually not my hot take. It's... it's a hot take by Allison Johnson, a, uh, writer for, uh, The Verge. Vitale, what do you think? What is your opinion on the Humane AI Pin, I think it's called, right? Or the Rabbit R1? Like, do these have a future? Or are you saying, like, no, my phone is the future?

Speaker 3:

I think they do. Okay, I think they do.

Speaker 2:

Okay.

Speaker 3:

Only when the small LLMs or small Gen AI models would be performant enough and energy efficient enough to run locally on a wearable, instead of having the need to have a connection and perform requests to an endpoint, a remote endpoint. Why? Because then maybe you can have specific systems, specific tools, specific gadgets that you want to carry with you, maybe in particular situations or even in your everyday life, and you need a snap response. So you ask a question, you touch a button and you ask a question, and you don't need to take your phone, take your application, unlock your phone and the model is running locally. So it's immediate. Basically the response to your question there. Yes, there can be a future, but how they are implemented now?

Speaker 3:

they need to rely on external services they need to call open ai, they need to process the response and then the interaction is less human, less interactive.

Speaker 2:

So I think the use case that you're mentioning, like, uh, there, you can just press the button and you have a very quick response. I think you can already make the argument that there's enough compute in a phone to do that right. And I think in this article they make the argument that what we have next to our phone, if you don't want to take your phone out, we already have earbuds which are widely accepted, and especially with the pass-through of the airpod pros, for example. Something else has this like you see people walking around with it all day having conversations and no one really thinks like this is weird. So there's already something accepted, so why not use that? I think that's a bit of the statement that we're yeah, I think also there's a.

Speaker 1:

There's a picture here. I thought it was funny because, well, maybe for the people that... I didn't know about the Humane AI Pin kind of thing. Uh, yeah, I didn't know about that. I don't have the, I don't have the budget for that, so I don't even look at those things.

Speaker 3:

Okay, okay I mean I know for his different parts.

Speaker 1:

But, uh, you know, some of us gotta gotta eat. You know what I'm saying?

Speaker 1:

Um, anyways, some of us read the news. Um, the Humane AI Pin review. So, I mean, this, from what I read here, is basically a wearable device. The idea is that you can use it like a pin, you put it on your, on your chest, and then it follows you around, has a camera. Then you can just tap into it and you can, I guess, talk to it and ask questions. So it should be a sleek wearable device for AI, I guess. So that's one. Then the Rabbit R1, it is something that we shared a while ago, I want to say, uh, pre-Alex times, um, which is, like, a very, completely different, redesigned, uh, device for, uh... they call it large action models. Actually, I haven't heard anything about it, because I remember people that ordered it should have received it already. Because, I think, that's a good question.

Speaker 2:

We should research it a bit yeah but the first results are if they're already shipped, I think well, it says new orders ship on june 2024, but I think the first one is no, no, but this is new orders, I think the first ones.

Speaker 1:

I remember that by Easter I should have had it if I were to order it. Then I didn't do it. I'm too frugal for that. Again, like I said, I don't have the budget anyways. So two different devices. The idea is that they redes saying well, if you take those foldable phones, you can just kind of clip it on your shirt and you kind of have the same thing as the dot has. So again, a lot of the stuff that they're promising with these things yeah and they uh what your main pin has the latest version.

Speaker 2:

Actually, it looks like a laser-based screen, so you hold up your hand and it's like it projects an image on your hand oh really like there's a lot of effort if you can just like take your phone out of your pocket, right?

Speaker 1:

yeah, that's true. I'm also wondering, like with the like the ray-ban glasses from meta, you know, like how you had, I don't know if you can project something in the lenses of the glass or something because I know they're not only you have a camera?

Speaker 1:

no, okay but I know that, like cars, sometimes on the dashboard you know, like the like I don't know, actually I have never been in a car like that, but I heard from people like you can have the the dashboard, like with the speedometer and everything in the the windshield, but it only the driver sees it because of the way that the light hits. Yeah, right, so you can see that and you see still the background and everything. So so I'm wondering if you have glasses or something instead of projecting. It seems like an easier. One last thing as well on that is this Can you computer by? I don't know if you ever.

Speaker 1:

The website is very minimal, actually, and they had more information about this before. I'll make this a bit bigger for the people that are in the live stream. Basically, the first product is Dot, a highly personalized, intelligent guide for iOS. Dot is currently in closed beta and you can sign up on the waitlist here. So I have signed up for the waitlist. I haven't heard from them, but, as I understood, it's like an operating system that will work on iPhones, or maybe it's an app or something, but it's also kind of redesigned for LLMs and for AI. So I'm also wondering, like, there is an in-between, right? Maybe you don't need a new gadget. Maybe you can just take the phone and reuse it, like, instead of Siri, you have just something that is full AI. So this would just be a functional Siri. Maybe something like that, but it's like the...

Speaker 1:

The rabbit R1 kind of reminded me a bit of that right, because they called it like they made a big fuss. The large action models and all these things.

Speaker 2:

Yeah, it's true.

Speaker 1:

And it's like, they have APIs and they can make actions for you. They can order an Uber, they can do this, they connect...

Speaker 2:

It's what you want Siri to do for you Exactly. And instead of this, if I have to like, do it like five times, because every time like there's a typo in there and like it's horrible.

Speaker 1:

Siri open WhatsApp. Hey, what's up? What's up with you? Yeah?

Speaker 2:

So we have this take. What was the official take again? The future of AI gadgets is just phones. Let's take a timeline here: coming five years. Vitale, is it just phones, or will there be other hardware?

Speaker 3:

so do you also, uh, comprehend different? Do you want me to? Because I think the the existing form factors like earplugs are, you're saying, smart watches.

Speaker 2:

We will see a lot of integrations of on-board ai models there but like there's no nuance right, like there's like just a blank, so like the future of ai gadgets is just phones but I'll just say I think I think no, I think okay okay, no

Speaker 1:

no, five years coming, five years, five years?

Speaker 2:

yeah, I think so yeah, I think, I also think yeah I think yeah, okay I think five years. We're gonna sit together gonna discuss this topic.

Speaker 1:

I'll be like I will have my ai you will no longer speak.

Speaker 3:

It's like, it's like a speaker in this tooth okay, there we have it well.

Speaker 1:

If anyone... maybe we can do a little call-out. I think it's the first time we do it, huh. But if anyone has any thoughts, maybe we should launch a poll or something. No, yeah, but like, it would be sad if only two people reply.

Speaker 2:

It would be a bit the other hot takes you're not going to do today.

Speaker 1:

No one per day Bart.

Speaker 2:

One per day.

Speaker 1:

One per week. One per week Because what if we run out of hot takes?

Speaker 2:

Because you put a lot there right, Well, three yeah. I thought it was going to be a hot day today.

Speaker 1:

No, not that hot. I'll wait for summer for that, you know. But I think this is it. I'll leave it at that. Anyone else has any final words?

Speaker 3:

Thanks for having me here.

Speaker 2:

Thanks for being here, Vitaly.

Speaker 1:

Thanks for joining us. Indeed, thanks a lot.

Speaker 2:

Thanks everybody for listening.

Speaker 1:

Thank you all, ciao.

Speaker 3:

You have taste in a way that's meaningful to software people. Hello, I'm Bill Gates. I don't know what this is.

Speaker 2:

I would recommend TypeScript. Yeah, it writes a lot of code for me and usually it's slightly wrong. I'm reminded, incidentally, of Rust here Rust, rust.

Speaker 3:

Congressman, iphone is made by a different company and so you know, you will not learn Rust while skydiving. Well, I'm sorry, guys, I don't know what's going on. Thank you for the opportunity to speak to you today about large neural networks. It's really an honor to be here.

Speaker 2:

Rust Data topics honor to be here.

Speaker 3:

Rust, rust, rust, rust. Data Topics. Welcome to the Data Topics. Welcome to the Data Topics podcast.

Speaker 1:

That was not satisfying. Okay, that was a bit better. Thank you.

MLflow Ambassadors Discuss MLOps
Casual Conversation About Openly Available LLM
Shift to Metaverse and Apple Vision
PDF Text Chunking in Python
Discussion on Text Processing Efficiency
OpenTofu and Terraform Comparison
Future of AI Gadgets
Future of AI Gadgets Without Phones
Software and Data Topics Podcast