DataTopics Unplugged

#35 Mamba to Challenge the Transformer Architecture?

February 06, 2024 DataTopics

Welcome to the cozy corner of the tech world where ones and zeros mingle with casual chit-chat. DataTopics Unplugged is your go-to spot for relaxed discussions around tech, news, data, and society.

Dive into conversations that should flow as smoothly as your morning coffee (but don't), where industry insights meet laid-back banter. Whether you're a data aficionado or just someone curious about the digital age, pull up a chair, relax, and let's get into the heart of data, unplugged style!

In this episode, "Mamba to Challenge the Transformer Architecture?", we delve into a spectrum of tech topics and scrutinize the facets of productivity in the engineering realm. Here's a sneak peek of what we're discussing today:

  • Checkout or Switch: Exploring new tech tools and their necessity. For further reading, check out this article.
  • Mamba: A Viable Replacement for Transformers?: Discussing Mamba's potential to replace Transformer models in AI, and the paradigm shift it represents. Learn more in this LinkedIn post and look forward to its presentation at the ICLR 2024 conference.
  • Biggest Productivity Killers in the Engineering Industry: Identifying the top productivity obstacles in engineering, including perfectionism, procrastination, and context-switching. Dive deeper into the topic here.
  • Meta's Copyright Contradiction: Analyzing Meta's approach to copyright law in protecting its AI model while contesting similar protections for others. More on this discussion can be found here.
  • Speeding Up Postgres Analytical Queries: Showcasing how pg_analytics enhances Postgres analytical queries by 94x, posing a question on the need for other tools. For more insights, visit this blog post.

Intro music courtesy of fesliyanstudios.com

Speaker 1:

Good girl.

Speaker 2:

Ready, I'm going to hit it.

Speaker 1:

Hit it, Murilo. Hit it, Murilo, please. Yes, I'm the captain now. Hello.

Speaker 2:

Hello and welcome to DataTopics Unplugged, the casual corner of the web where we discuss what's new in data every week. From copyrights to productivity, anything goes. We're also on YouTube, LinkedIn, Twitch, all of the above. LinkedIn, Apple Podcasts, yeah, yeah. So if you want to join us in the live stream as well, feel free to come in and leave a question. Bart is the sound engineer, the one always there listening, looking out, so feel free to leave your question. Today is the 2nd of February 2024. My name is Murilo and I'm joined by the one and only Bart. Hi. And today we don't have any guests, unfortunately.

Speaker 1:

Yeah, but it's you and me.

Speaker 2:

You and me today, but we have some exciting guests in the coming weeks. So cool, cool, cool. So what do we have to start with? Meta used copyright to protect its AI model, but it argues against the law for everyone else. What's it about, Bart?

Speaker 1:

It's a small article I came across. We'll link everything in the show notes. It's not really news, more things that happened in the past, where they argue a little bit: isn't it a bit weird what Meta is doing here? What they more or less state is that all the data they used to train Llama falls under fair use. We all know there is a lot of discussion about whether or not this is fair use, right? So there they say this is under the fair use policy. And the other side of this is that when, at some point last year, Llama got leaked, yeah.

Speaker 1:

It ended up on GitHub, and Meta basically, I think "sued" is a strong word, but they took legal action against GitHub to take it down. Ah, really? Because they said: this is our property, you're not allowed to use this or share this with the public. Really? And that's a bit the duality that they present in this.

Speaker 2:

But then is this the same? What I remember, and just to make sure we're talking about the same thing, is that someone went on GitHub and made a pull request on the README and put in a torrent link to say: oh, if you want to download the weights, instead of going through this form and everything that Facebook makes you do, you just use this, because it's faster and saves bandwidth.

Speaker 2:

So they basically just made a pull request. It was never accepted. But I think it was a bit funny too, because you go to the comments and there's a whole bunch like "Looks good to me", "Looks good to me", from people who would have nothing to do with Meta, right?

Speaker 1:

I'm not sure if it's about the source code or about the weights. I think it was a while ago.

Speaker 2:

Because I remember for that.

Speaker 1:

So the weights were never supposed to be on GitHub, right it was supposed to.

Speaker 2:

But even that, well, it was a torrent link to download the weights. It was on a pull request on GitHub, so you could just kind of copy the, I think it's the magnet link or something, I forgot the name, so you could basically download the weights for Llama. And yeah, but I think even...

Speaker 1:

But anyway, it was a long time ago, but this is what it's about. It's about the weights, or that the initial version of the code leaked. Well, it wasn't public, but they basically sent a sort of cease-and-desist to GitHub to say you're not allowed to share this. While, on the other hand, everything is fair use. You can use everything, right? And that's a bit the...

Speaker 2:

Is it possible that Meta is such a big company that there are actually different departments with different philosophies on this, or do you think it's just them being sneaky?

Speaker 1:

That's a possibility, but I think in general, what is fair use in the LLM age has become very vague, and I think it's very weird. Yeah, for some reason, no one is really lying awake at night over this.

Speaker 2:

Yeah, yeah, yeah, yeah, I think it's also there's no, it's like it's very important issue, but it's not urgent, I guess, and I think that's why that maybe also you get, you get a bit numbed to it. Yeah.

Speaker 1:

Like you hear so much about it and like, okay, yet another new thing, yet another new thing. Oh, wow, these advancements these advancements and you focus on that a bit more, I think, and you get a bit numbed to the ethical considerations on this.

Speaker 2:

Yeah.

Speaker 1:

Well, if you would have said this five years ago, like, these big companies are just gonna say fuck you to IP rights, basically, yeah, we would have said: what? Oh no, this can't happen. Yeah, it's a crazy situation. And we don't question it anymore. Like there's someone. Yeah, maybe there will be a loss of paper. There will be a trim reference, but it's not really there yet.

Speaker 2:

Yeah, yeah.

Speaker 1:

And I find it funny that it's not more of a talking point. If you would go back five years and say that, then it would be very clear. Yeah, for sure.

Speaker 1:

And now you don't really question it. It's like you have something weird happening, like I'm saying: oh yeah, my friend went to ski in the Alps and he got killed by an elephant, and not questioning that. It's a bit absurd. Yeah, yeah, shit, yeah, alpine elephants, right? But it's a crazy situation and we're not really questioning it. And that's a bit the duality that they're exposing here. Yeah, that's true.

Speaker 2:

I think it's also the fact that they just did it, like OpenAI and all these things. They just did it, and I think the outcome was so shiny that people didn't... Yeah, that's the thing. That's the thing. It's like if you make announcements A and B, and B is a bit bigger than A, maybe it overshadows it, right? And I think that's kind of... Yeah, you even see this with more negative stuff, like wars, like things that were going on.

Speaker 1:

And you also get a bit numb to it. Yeah, yeah, yeah. But here, the other side of the coin is that there's also a lot of excitement about this. Yeah, indeed, that's a big difference.

Speaker 2:

I think the excitement outweighs by far the whole discussion on what's property, right? Yeah, indeed. And I think maybe another thing that makes it trickier is that it's the web, right? When you talk about laws and frameworks and all these things, then usually there's a government.

Speaker 2:

There's, you know... and I think on the web the lines get blurrier. So I think it's tricky, and I think the pace at which people are building new things and doing new things is so fast that sometimes it's hard for lawmakers to catch up and really understand what it does and really do this and that, right? But yeah, I think it's at least ironic, right? The Meta thing is at least ironic.

Speaker 2:

And, yeah, we hear a lot about AI models. Or not? We don't really hear a lot about AI models. That's the point, actually. Right, we hear a lot about transformers, and we have heard a lot about transformers for a while, right? But if you think about it, transformers are not that new, are they?

Speaker 1:

I want to say 2018. I'm not sure when the paper came out.

Speaker 2:

I think it's 2018, yeah. It's the paper...

Speaker 1:

Attention Is All You Need. From someone at Google.

Speaker 2:

Yeah, from someone at Google, right, and there was BERT and all these things, and I think it really took off when they went to another dimension of training data and compute, right?

Speaker 2:

So I think that's actually the main thing that unlocked it, right? They always said big data. Even when I was studying, they kept saying big data, big data. But the big data we referred to back then is another dimension than the big data we mean when we talk about LLMs and ChatGPT and all these things, right? And I think that's the main challenge that it solved in a lot of ways. And why am I bringing this up now? Because I'm not a researcher, and I think we will have some guests that are more in the research space, so I think it would be an interesting thing to ask them. And one thing that made some noise is Mamba. Mamba, Mamba. Is it the snake? I guess it is, right? Every time I hear Mamba, I think of Kobe Bryant, because, you know, he had the Mamba mentality.

Speaker 1:

Ever heard of that.

Speaker 2:

No? No? If Tim was here, he would kill it.

Speaker 1:

What is Mamba.

Speaker 2:

What is Mamba? So maybe just to put a disclaimer: I read a bit about it, but I haven't looked too much into it. I think it's something new. It's a new architecture. Basically it's like with state space and whatnot.

Speaker 1:

Maybe for the listeners: we're talking about transformers, which are a form of architecture that is today more or less considered the state of the art. Things like ChatGPT and Llama use the transformer architecture.

Speaker 2:

Yeah, GPT in fact stands for generative pre-trained transformer.

Speaker 1:

I want to say so yeah, and Mamba is a different type of architecture.

Speaker 2:

By the way, in the article that we also link in the show notes, it says here that Attention Is All You Need came out in December of 2017. So even before that. I stand corrected. Yes, yes, yes, yes.

Speaker 2:

So it's a new architecture, right? There was the attention mechanism, which really was the breakthrough for LLMs, and now this is the new contender, let's say. It's exciting because it's very new, right? Before, transformers were more in NLP; now they're in computer vision as well. NLP is natural language processing. Natural languages are spoken languages, like English, Dutch, French, Portuguese, as opposed to Python, right? So there are programming languages and natural languages, and that's where the NLP comes from. And there were some variations of transformers in how they work: the encoder, the decoder, the encoder-decoder architectures, et cetera. But the general idea was kind of the same, and this Mamba seems to be something different. Again, I'm not an expert, right, but I think it's in terms of efficiency and context length. That's the main thing I've seen highlighted in some articles: that ChatGPT struggles with long sequences. So if you have very long pieces of text, it struggles, and Mamba should outperform that.

Speaker 1:

That's a bit the... This is still very new. Mamba comes from a paper that was published in December, I think, beginning of December, somewhere. There is also a model that you can... I'm not sure if the model is actually public, but the benchmark results are public. And it has been accepted as a paper at a big conference, ICLR, in March, I think. And Mamba is a state space model, so not a transformer architecture but a state space model, which is not necessarily super new. I think it was already being used in robotics, et cetera. But it is new in the sense that Mamba applies this to the natural language domain.

Speaker 2:

Yeah, I think I also saw something in the article. Well, I don't want to speculate too much, right, but the idea is similar to what recurrent neural networks were doing. And then the LSTM came, the long short-term memory, which would be good at knowing what to keep and what not to keep in terms of information. And then, I guess, the SSM, the state space model, would be good at kind of remembering, quote unquote, what you should keep from one part of the text to the other. Again, that's very high level.

Speaker 1:

Yeah, I must say that I also have a hard time trying to visualize the exact difference and the exact architecture and how it works.

Speaker 1:

I think it would be interesting to have a guest on that is a bit more in-depth on this.

Speaker 1:

As I understand it, one of the major differences is that with a transformer architecture, when you pass in text, the text gets chunked into tokens, where a token is more or less a few characters together. Then, if you have a long text, a transformer model basically has to make connections between all these different tokens, from the first one you passed to the last one.

Speaker 1:

So you get this very complex representation. But it also means that the longer the input becomes, the longer the context becomes, the more complex your model becomes to hold all these connections. And with the state space model, and I'm probably very much butchering and simplifying a lot of things here, your model has a certain state. You pass in a new token and your state changes slightly based on that token, and you more or less follow the tokens. So you don't have this blow-up; it's more of a linear effect when you process context. Well, the claim is that it is more performant on long context, both in terms of the actual performance on the NLP, domain-specific side, and in that it's faster at inference.
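A rough way to picture the scaling difference described here, as a sketch rather than a measurement: the function names below are made up for illustration, and the counts ignore everything that makes real models fast or slow (batching, hardware, caching). The assumption is causal self-attention, where each token is compared with itself and every earlier token, versus a recurrent or state space layer that does one state update per token.

```python
def attention_pair_count(num_tokens: int) -> int:
    """Token pairs scored by causal self-attention.

    Token i is compared with tokens 1..i, so the total is
    1 + 2 + ... + n = n * (n + 1) / 2, which grows quadratically.
    """
    return num_tokens * (num_tokens + 1) // 2


def state_update_count(num_tokens: int) -> int:
    """State updates made by a recurrent / state space layer: one per token."""
    return num_tokens


if __name__ == "__main__":
    # Doubling the context roughly quadruples the attention pairs,
    # but only doubles the number of state updates.
    for n in (1_000, 2_000, 4_000):
        print(n, attention_pair_count(n), state_update_count(n))
```

This only illustrates the "linear versus growing with everything seen so far" intuition from the conversation; it says nothing about model quality.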

Speaker 2:

So maybe just to make sure I understand what you were saying. So, as like in transformers, usually you pass the whole sequence at once.

Speaker 1:

It depends a bit on the exact type of implementation.

Speaker 2:

But then with the state space model it's really, again, almost like going back to a window kind of approach. You pass in one token at a time, you change the state gradually, and then I think the main difference... but again, we should have someone that knows more about this.

Speaker 1:

The main thing is that you have a certain state and you make adjustments to the state based on the last thing that you saw. While with a transformer architecture, you need to make sure that there are associations between everything that you saw in the past and the last token, and that becomes more complex as you add new tokens, versus a state that changes every time.
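The "a state that changes every time" idea can be sketched with a toy one-dimensional linear recurrence. This is not Mamba: `run_state_space`, `decay` and `input_gain` are hypothetical names chosen for illustration, the state here is a single number, and the coefficients are fixed, whereas real state space language models use learned, higher-dimensional parameters (and Mamba, as far as the paper describes it, makes some of them depend on the input).

```python
def run_state_space(inputs, decay=0.9, input_gain=0.1):
    """Scan a sequence with the recurrence: state = decay * state + input_gain * x.

    The state has a fixed size (one float) no matter how long the sequence is,
    and each step only looks at the previous state and the current input.
    """
    state = 0.0
    history = []
    for x in inputs:
        state = decay * state + input_gain * x  # adjust the state using the latest input only
        history.append(state)
    return history


if __name__ == "__main__":
    # An early spike keeps influencing later states, fading by `decay` each step.
    print(run_state_space([1.0, 0.0, 0.0, 0.0]))
```

The point of the sketch is the shape of the computation: nothing from the past is revisited, so the work per new token stays constant, which is the contrast with attention that the hosts are describing.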

Speaker 2:

Yeah, basically you need to increase the thing that you need to intake.

Speaker 1:

And there is quite a bit of hype in the community around this, because the performance looks very good on long context, and also when you compare it to smaller or similar-sized transformer models, it seems to outperform them. But it is still very, very new, yeah.

Speaker 2:

So let's see what it... Yeah, I mean, to me it's a good sign, right? Because again, December 2017, so 2018, and it's 2024 now, so six years, kind of, right? So to see something new, I think it's refreshing.

Speaker 1:

Let's see. Let's see. It's similar to last year: RWKV, I think it's called RWKV. That language model is an RNN-based language model. It also created some hype, a bit similar to the hype that is currently being created. I have the feeling it didn't truly take off, but maybe I'm not really up to speed on those things. So let's see. Let's see, maybe ChatGPT 6 will be Mamba based. Yeah, I saw that. But actually, GPT stands for... what was it?

Speaker 2:

Generative pre-trained transformer.

Speaker 1:

So it will be.

Speaker 2:

ChatGPSSM.

Speaker 1:

ChatGPSSM, I don't know.

Speaker 2:

Yeah, but the thing we also mentioned about the research: they compared it with similarly small-sized models. Which also makes me wonder whether ChatGPT and the transformer models are bad for research, in the sense that it's hard to know. Just because the model does better at smaller sizes and with less data, that doesn't mean it would still be better when we get to the size of ChatGPT.

Speaker 1:

Yeah, I also find that very difficult.

Speaker 2:

Right, like you see, a lot of research is conducted in universities. Do they all have access to the same resources? They do, because to me, ChatGPT took off. I mean, again, ChatGPT is one year old and now it has all the hype, or a bit more than one year, but transformers, the actual architecture, has been there for so long, right? So why did it take off?

Speaker 2:

It's because we had someone that said: okay, I'm going to put millions on this, now let's just push the limits and see where we land. Right? And I'm wondering, now, with research, to really see if the SSMs... either they're going to be much better with less data, or, if they need that much compute, it will take some time, I guess, right, to have someone that will bet on this again.

Speaker 1:

Well, I think, if there is any value in it, the big players will use it.

Speaker 2:

Yeah, that's true, that's what I think. I think so too, but I think it also restricts it a bit, right? Because, yeah, I don't know, I just feel like back when we were talking about neural networks, the CNNs and the RNNs and the LSTMs, it was much easier for people to compare what's state of the art, because everyone kind of had access to the same compute.

Speaker 2:

It was easier to spin up a big GPU on the cloud and say: oh yeah, look, comparison. But now state of the art is, like, trained on the whole internet.

Speaker 1:

Yeah, and I also find it personally very hard to interpret these metrics. Every other week a new model comes out and it's like: oh, we outperformed X on that domain with these and these numbers. And then, wow. And most of the time, when I actually try it, it's like: oh, yeah, let me just go back to GPT-4. It's still better.

Speaker 2:

Yeah, I think... Yeah, I mean, I have a friend who was doing a PhD, and he did say as well that they're very focused on publishing papers and on what's state of the art and what is this and what is that, and sometimes that...

Speaker 1:

But it's also like it's hard to express in a number right Like the performance. Because you expect like it's natural language and you expect something Like it's not easy to put that in.

Speaker 2:

It's an approximation of what you want to see, subjective, right. It's like it's more of a feeling yeah, I see what you're saying. I see what you're saying, but let's see, let's see, let's see. But maybe we can switch topics.

Speaker 1:

Okay, let's switch. You're winking to me, but I have the feeling that this is a segue.

Speaker 2:

Yes, it is a segue.

Speaker 1:

You always do these segues so naturally. I know, right? I'm so smooth. You use Git, Bart? I use Git.

Speaker 2:

Maybe what is Git for someone that doesn't know what Git is.

Speaker 1:

Git is a code versioning system that allows you to... oh, you're putting me a bit on the spot with this definition... that allows you to version your code. What does that mean? I've written some code and I save it. Tomorrow I work on it again and save those changes again, and over time you can see in the history what changed, when, and also by whom, and you can also reverse these changes. And you can build on all of this by forking and going to branches and making sure that your development experience gets optimized and all of these things.

Speaker 2:

Indeed, is that a good definition. I think so, I think so, I think so, I think so.

Speaker 1:

And Git is an implementation.

Speaker 2:

Yeah, yeah, yeah, coming from Linux, right? Linus Torvalds, the same guy from Linux. Linux itself, yeah, yeah, yeah. Actually, I think he called it, like, the stupid... Sorry, you said he called something... you're talking about Linux? No, no, he called Git, it was like, the stupid something tracker. Okay, okay. Git, the stupid content tracker. Okay, okay. I'm not making this up. We almost got cancelled there.

Speaker 1:

Yeah, almost.

Speaker 2:

I'm not making this up.

Speaker 1:

So this is his nickname for.

Speaker 2:

Yeah, I think that's what he called it, the stupid content tracker. But indeed. So I think the first time, when I was in university and learning about Git, I came across some videos, and the first comparison was: you build your Word document on your own computer, you've built a report, and then you ask for feedback, and it's like, okay, now it's report final, or master thesis final. And then you read it again and it's like: no, no, master thesis final final. And then you have all these files, right? So Git is a way to keep track of all these versions, and it's also very popular for software development, right? There are these things called branches, which are different versions of your code that can live in parallel to each other and that you can also merge. And talking about branches: Git is a CLI, command line interface, tool, and you can actually switch between branches, right? How do you switch between branches?

Speaker 1:

Maybe one step, Maybe for listeners. What is a branch?

Speaker 2:

Yeah, so a branch is like... if you think of a project as a tree, right, you're going to have different branches, meaning that, let's say, the original version of your code will take two different paths.

Speaker 1:

Yeah, Maybe two people working on it and they can merge later. Yeah, exactly.

Speaker 2:

So, for example, if we have some code, me and you, Bart, we're working on something. I don't want to be seeing the changes that you're making, so I can have my own branch, so I only have my changes, and then we can actually merge them later. We can make sure that my changes and your changes are both included in the final product. So it's something that's nice for us to...

Speaker 1:

So, to come back to your question, how do I switch branches? I typically type git checkout my-branch-name. Or when I create a new branch, I do git checkout -b my-new-branch-name.

Speaker 2:

Nice, I do that too, but apparently there's a new command called git switch (oh wow, yes), which is made exactly for that.

Speaker 1:

Isn't checkout made exactly for that?

Speaker 2:

Checkout, I mean... So again, that's what I also wanted to bring to you. Do you think this is something that was needed or not? Because, from what I read, the motivation behind creating this command is that checkout does too much. So, for example, let's imagine you have changes that you don't want to include. You want to discard the changes on these files. Okay, how do you do this? So tell me: you want to discard changes that you have in your Git project.

Speaker 1:

My cutting-corners way is a git reset.

Speaker 2:

Okay, dash dash hard. But so I guess they were saying that checkout is too powerful a command. I guess it does too many things, because you can check out not just branches, you can check out files.

Speaker 1:

Yeah, you can refer to all that.

Speaker 2:

So then the idea is that now there is a git switch and a git restore, so they basically broke it down into two things.

Speaker 1:

And it restores that branch?

Speaker 2:

I think so. I think it restores. To be a bit more specific, and if you don't know Git, maybe it's a bit much, but instead of committing, so instead of saving, quote unquote, these changes, you can check out the changes, right? So you go back, and now you have restore to do that. So restore is just to not commit stuff; it's like git checkout -- whatever, I think.

Speaker 1:

And are you going to use it?

Speaker 2:

I'm not sure, right, because I also have aliases for this. So instead of typing git status, I usually type gst, because it actually comes with Oh My Zsh, right, there's a whole bunch of aliases. So indeed, git checkout is aliased to gco, and git checkout with a new branch is gcb. So I feel like now it's so much muscle memory that I don't know if it's something I'll adopt. But I was also wondering, because when I was learning Git I don't remember thinking that this was so weird, especially because Git a lot of the time tells you: ah, if you want to commit this, run this command; if you want to ignore this, run this. So a lot of the time it's copy-and-paste stuff like git checkout dash-dash, staged, whatever, you know. You almost don't have to remember this. Or when you're pushing a new branch to the remote, right? A lot of times it's like, you git... no, you git commit...

Speaker 1:

You basically use like two or three actions and all the rest like basically means I'll fucking mess something up and I do need to Google it to fix it.

Speaker 2:

Exactly. Right, you can change it. Now you can change it. But, I don't know, when I was learning Git, I don't remember anything confusing me that much, because the tool would give me suggestions, you know: oh, if you want to save them, do this; if you want to do this, do that. But now I'm sounding a bit older. It's like: oh, what I'm used to is good enough. That's kind of my second thought.

Speaker 1:

Let's come back to this in a few months. I'm going to ask you again: are you using checkout or switch?

Speaker 2:

But I think I need to actively try to do it Right.

Speaker 1:

How do I typically do these things? When there are new things in a language, or new things in a tool that I use a lot, typically I think: yeah, I don't use it, I was never looking for it, so I'm not going to try it. It's because it's not part of your regular routine. And then at some point I think to myself: you're getting old, you need to force yourself to do these things. And then you do it three times and it becomes the new normal.

Speaker 2:

Yeah, maybe, but I mean, sometimes you are missing out, right? Like, for example, ChatGPT. There are people that say: oh, I never needed it in my IDE. But you are missing out in a way, right? It's a huge product.

Speaker 1:

Yeah, I think to remain knowledgeable or relevant you need to actively keep testing stuff. And maybe this example is a bit minimal for that.

Speaker 2:

But I think it's even, like, I don't know, Python packaging. There was the requirements.txt and the setup.py. Then Poetry came along with pyproject, and I was like: oh, I'm going to try this other thing. And then people are like: oh, why would you try Poetry? This works. And I was like: well, I could have said the same thing when I was using requirements.txt and setup.py. But if you don't know what the alternative is, you're never going to be able to compare and know what you're missing out on. But that's the thing. Yeah, right, same thing with ChatGPT, right? You say: oh, I'm okay with my, quote unquote, dumb autocomplete. But if you're using Copilot, maybe if you try it once, it's like: okay, this is what I'm missing out on. And I think maybe you should give it a try. It's true for a lot of things, right?

Speaker 1:

I was on Atom in 2018, and I said the same thing about VS Code: why do I need a new IDE? But you try it out, and then, yeah. And then you see some white hairs and you're like: I sound old. But if I understand you correctly, you learned Git in university. You never used anything else, you never used another...

Speaker 2:

I think Mercurial is one of them.

Speaker 1:

Mercurial is one of them. Never used it.

Speaker 2:

Have you.

Speaker 1:

Well, before Git I used SVN, and then before that Mercurial, actually.

Speaker 2:

Yeah, and is there a big difference between them? Did you have a favorite or was it just like okay, people using Git. It's kind of the same old, different flavors of the same thing.

Speaker 1:

I have the feeling I can't compare it well, because definitely when I was using Mercurial, and to a big extent also when I was using SVN, it was not in a big team, and my experience of working on a code base with a team mainly centers around Git.

Speaker 2:

Yeah, it's hard to compare the two. I'm also wondering, since we talked about versioning: if you go to Google Drive, there is versioning as well, right? And I said, I was like... I have the feeling you're going to say controversial stuff. No.

Speaker 1:

I don't need you to.

Speaker 2:

I want to see someone try to do something like this: a versioning system for projects, but not Git, something more user friendly, or something that looks a bit more like the versioning we see in Google Docs. I think that could be interesting to take a look at. But, yeah, not sure. And I think a lot of these tools are there to increase our productivity.

Speaker 1:

Yeah right. You're winking again. It's probably segue again.

Speaker 2:

If you're not watching the live stream, you can't see Bart's face when he realizes what I mean. Productivity killers. So on the other side of productivity boosts, we have some productivity killers. You know all about it. Yeah, too well. Anything you want to get off your chest here?

Speaker 1:

No, no, no, I'm gonna wait for a.

Speaker 2:

So this is maybe a lighter, less technical topic, but something that came up after New Year's: the biggest productivity killers in the engineering industry. The author here, Gregor, I want to say, and I'm not going to try his last name, listed the top three productivity killers from his point of view. Okay, and actually there's an additional one, but basically the three are perfectionism, procrastination and context switching. So the perfectionism, I agree, and this links to other things I think about my life in general, right? The more I'm working on things and working on projects, the more I find that it's better to just get something out the door that is not perfect, that is not feature complete, and then you can take a pause, right? It's fine, you know, you can just do it and it's there. And then maybe once you get the hang of it, or you're motivated to get back to it, you can take another step, right? And usually when I'm tackling projects like this, I go way further than I would have if I had scoped it super strictly, like: oh, I have to have this, this, this and this, and until this is finished, I'm not going to publish, I'm not going to tell anyone. So I guess this is not exactly perfectionism. This is more like feature completion, I guess, and scoping, but in the sense that there's the ideal and then there's the practical. Same thing with the podcast, actually, right? We had a lot of ideas of things that we could do, but it doesn't need to be perfect.

Speaker 2:

Let's make something that is easy for us to get out the door, to do every week. You know, we didn't start with the live stream. Now we have the live stream, now we have this, then we have that, and we're going to have external guests soon. So I think it's a... I mean, I'm trying to learn Romanian; my partner's Romanian. I'm using Duolingo because it's something easy. It's not the best, it's not perfect, right? Maybe if I were to buy a book or do a course or something, that would be more efficient. But I'd rather, I think, tackle things more as: do a little bit every day, something that you can, you know, be consistent with.

Speaker 1:

But do I understand correctly: perfectionism for you in your engineering work is not a thing?

Speaker 2:

I feel like you're talking some trash about me there. Well, I think perfectionism is like a fluffy word, because what does it mean? For me, like, if you look at something we built, right: is this perfect? You may say yes, I may say no. So it's a bit controversial, and I feel like nothing is really perfect.

Speaker 2:

It's almost like you're writing a book: you can always revise it, you can always make it better, right? So, first, nothing is perfect, but to me it's more about being consistent and being the most efficient, or not efficient, but like the most... how can I say it? The thing that checks all the boxes, I guess. So last week, for example, I made a comment about AI coaches for running, and I said that when I work out, I just go and try to do something that is easy. It's definitely not the most efficient, but it's something that I can keep up doing. And I think, in general, being efficient or getting the most out of your effort is definitely important, but from my point of view the most important thing is being consistent.

Speaker 2:

So the same thing with diet, the same thing with all these things, right? If you can be consistent, so you can do something every day, you can train every day, and you can still try to get more out of the effort that you're putting in, I think that's perfect. But if, to get more out of that one workout, you end up being less consistent, I don't think that's a good trade-off, right? And I think the conversations, a lot of the time, are about what's more efficient, what's better, what's faster, what's shinier, but I think sometimes we miss a bit, like: what can you do, what can you take up? You know, let's start there. Can you walk around the block? Just walk around the block, and maybe one day you're going to want to go for a run.

Speaker 1:

I think what you miss a little bit, when it comes specifically to software development, is that there's also a team aspect to this. I think a lot of people also think, like, "I'm going to commit to this branch and everybody will see it." This is also something that you need to get used to when you enter the field. And if you are like this, these things add up. Typically, you're not just working solo on a project, right?

Speaker 2:

No, that is true.

Speaker 1:

And if I look at myself, not necessarily for the team approach, but when it comes to perfectionism when writing code or when building solutions, software solutions, I'm a bit... I have two extremes. I can be very perfectionist when I don't have anything yet and I need to set up a system with a lot of components. I will Google, like, everything. I'll read everything. I'll look at how many GitHub stars it has. I'll look at how long it has existed. I'll also look at the shininess factor. I'll see, like, based on past experience. I'm very perfectionist, often a bit too much. And then on the other side, when I start, maybe I'm a bit too focused on "let's just get something that works."

Speaker 2:

Interesting.

Speaker 1:

Which I think, "let's get something that works" is very good to very quickly get something that works. It's not a perfectionist mindset at all, but it's not necessarily a good idea for the long run, from the moment that it scales and you have a team working on it.

Speaker 2:

Yeah, actually, interesting that you mention this. This week I had a conversation with a colleague. Well, he had a conversation with someone else, but he was talking about one-way doors and two-way doors, something that I think came from Jeff Bezos or something. Basically, there are some decisions that are one-way doors.

Speaker 2:

And there are some decisions that are two-way doors. With the decisions that are two-way doors, you can basically go back and forth, so you shouldn't sweat so much. You know, just do it, and if you need to go back, you go back. But it feels like that's what you're doing, maybe unconsciously, because if you choose a technology, that's a one-way door: for you to go back you have to refactor, you have to do a lot of stuff.

Speaker 1:

A lot of stuff builds around that. Interesting to think about it like this.

Speaker 2:

Yeah, so it's like you, not overthink, but you put a lot of time and effort into thinking about the technology because a lot of the stuff will depend on that.

Speaker 1:

Never said that I also make good choices, but it's neat.

Speaker 2:

No, but I think it actually matches pretty nicely right, because after that it's just getting something done and making the code look nicer, whatever. That's a two way door.

Speaker 1:

It's a good way to think about components and architecture, I think, indeed. To me, think about a database. You need a database for whatever. You can go with what is trusted and what you know will work, like Postgres. Or you can get the latest shiny database that has a lot of hype and might be very cool to use. Yeah, and that is... But if you think about it like the two-way, one-way door.

Speaker 2:

Yeah, yeah, yeah indeed.

Speaker 1:

Like, this is a database. Going back is gonna be super hard a few months from now, right? Changing the data storage is gonna be a big, big, big thing. Versus, let's say it's Python: I'm gonna write my Python code, and this class I'm gonna put in that file, that class in that file. I mean, that's a two-way door.

Speaker 2:

Like that you can change easily, right, yeah, yeah, yeah, yeah, it's an interesting way to, but it feels like that's what you're doing in a way, because maybe you are very pragmatic in just getting something out of the door because you know you can change it very easily afterwards.

Speaker 1:

Yeah, when you say it like that, it looks like a very conscious, well thought out approach.

Speaker 2:

Thank you, maybe it's just like that comes to you, you know, but.

Speaker 1:

And what were the other ones? The productivity killers in software engineering?

Speaker 2:

Procrastination, which I think is a bit self-explanatory. I do think the Well, I don't know if he's talking more in the personal project sense or more in the group sense. I think if you're in a team, there is an external pressure let's say so, I think, which is probably good.

Speaker 1:

Yeah, I think people tend to prioritize what they like the best, right? Which is not necessarily the most valuable.

Speaker 2:

Yeah, I think. Sometimes for me it's like you have an idea, you're motivated, and then you hit a wall, and then you kind of procrastinate after that. So sometimes I even go and, I don't know, book a public YouTube live stream, you know, because when I have the deadline, I will be forced to do it. Yeah, I know it won't be perfect, though, that's the thing. But if I was always waiting to make the perfect presentation, I wouldn't do it. It's again one of those things, you know.

Speaker 2:

It's like, just do something, get it out the door, and then if you do more presentations, you're gonna get better at it. You know, you're gonna have more stuff, you're gonna be able to share more ideas, right? So it kind of goes back a bit to the first one. And the last one is context switching. I think that was pretty popular among developers.

Speaker 1:

Context switching. Of the ones you mentioned, I think for me personally that is my biggest productivity killer.

Speaker 2:

Yeah, but I think that one is the one I'm most conscious of. I think it's-.

Speaker 1:

Oh, that's a good one, that is true.

Speaker 2:

Like, the other ones sneak up on me. Like, for example... that's a fair point, yeah. Especially the procrastination, the perfectionism one, because I was even thinking, I don't know, in AI, a lot of times you work with notebooks, and a lot of times that's for exploration. And I want to make the code in the notebook look nice, or I want to make sure that the cells are all nice and there's a logical progression, but I'm like, man, this is exploration, you're just prototyping. This is not the time, you know? And I want to put linters on a notebook, you know, like I want to run Black on JupyterLab, and I'm like, man... This is because you're the author of databooks.

Speaker 1:

I am the author of databooks. Exactly what this does, right?

Speaker 2:

Well, not really, not that in particular you can do-.

Speaker 1:

Well, with that you could also do linting. No? No, there are other tools for that. There are other tools for that, actually. Black, Black.

Speaker 2:

There is already. They already format Jupyter notebooks.

Speaker 2:

All right. But yeah, things like that sneak up on me. I know that it's a problem, but I don't notice how much of a problem it is. Context switching is easy to notice. Like, oh, you have this meeting, then you have this meeting, then you have this meeting, and they're all three different things, and then you have some focus time but then you have another meeting, and it's like, well, yeah, I had to switch contexts four times and it takes me a while. You know, it's more in my face. So I think I address context switching more than I address the other two. Well, procrastination, I'm not sure if I really do so much, but perfectionism for sure. What tips do you have for avoiding context switching?

Speaker 1:

Pfff, I don't. Bad at it myself. Okay, next topic. So I think it's also a personal thing.

Speaker 2:

Yeah.

Speaker 1:

So I think I'm personally quite good at context switching, okay, but there's like a threshold, I think: how many context switches do you do during a work day? And that threshold is different for everybody.

Speaker 2:

Hmm, okay.

Speaker 1:

And I think that threshold is something that you can manage, like, are you being interrupted by meetings, are you interrupted by calls, are you interrupted by things popping up on your laptop, like, and these are tangible things that you can manage, but maybe are not easy, right, because there's an external pressure to this as well, maybe external expectations.

Speaker 2:

Were you a very all-over-the-place kid when you were growing up?

Speaker 1:

Some say I was. I never looked at it that way. Everyone except you? I never looked at it this way. My mom, my dad, my brother, my... Some say I was.

Speaker 2:

Because I also I think I cope better with doing a lot of things than other people, but I also think like when I was growing up, that was something that was a bit frowned upon. You know like I needed to listen to music to do homework, I needed to do this. You know like I needed the stimulus. Yeah, yeah, yeah, yeah.

Speaker 2:

I wouldn't say, but I think... or I had a hard time in the teaching class, right? All these things. And now I wonder, and I don't have any credentials to talk about it, if this trait makes it easier for me to switch context and not feel like I'm out of energy, you know, because I'm used to it. In fact, if I'm just doing one thing for too long, I get bored of it. Like even databooks: I wrote it, but it's been a long time since I've touched it.

Speaker 1:

Still works, though, go and use it. Well, I've been a bit the same. Like when we talk about, for example, open source: I maintain it when I use it, and then, from the moment I don't use it anymore...

Speaker 2:

Like, it's not on my radar anymore. Yeah, yeah, that is true. But yeah, do we have... Yeah, I think we have time. We talked about databases as well, how choosing a database is, you know, are you going to use the latest shiny thing or Postgres? And then here I see pg_analytics, yes: "Speeding up Postgres analytical queries by 94x."

Speaker 1:

Ooh, yeah, like this is a hype title, right.

Speaker 2:

Yeah, yeah, clickbait title. Yeah, clickbait Okay.

Speaker 1:

This is something that popped up in my feed, I think yesterday. Actually, it is, I assume, a company, which is called ParadeDB, which has created a plugin for Postgres, the number one database. Yes, actually, maybe. Actually, I think, the number one open source database and the only one that is growing over time.

Speaker 2:

Yeah. Well, I wouldn't be surprised. Maybe a bit of a side note on Postgres. I also heard it somewhere, so not first-hand information here: there was SQL and then there was the NoSQL thing. Now Postgres has a JSON data type, so you can actually add some unstructured stuff there, JSON kind of stuff.

Speaker 2:

And then there's the vector database, and I heard someone arguing that it's kind of the same thing as with NoSQL. It's like, we have vector databases that are just vector databases, but then you have Postgres pgvector.

Speaker 1:

I think yeah, there's a plugin for Postgres.

Speaker 2:

Indeed, and it's like people go a bit out but they always come back. You know, it's like Postgres seems to kind of.

Speaker 1:

It's the mama elephant.

Speaker 2:

It is Actually indeed. I saw it in the road here.

Speaker 1:

For people that know it, there's an elephant logo.

Speaker 2:

Who needs anything else than the mama elephant?

Speaker 1:

Yeah. So pg_analytics is an analytical plugin for Postgres, and I haven't tried it myself yet. It promises very high performance, a very big improvement versus the native analytical capabilities of the Postgres database.

Speaker 2:

So when you say analytical, an analytical plugin, what do you mean exactly?

Speaker 1:

Analytical is... When we go a bit back in time, the focus of a database was more or less transaction based: I want to insert something, I want to update something, I want to delete something. This was the basic use of a database. When we look at it today, data has become a big thing, the analysis of data in a database has become very important, and there's a different focus. It's typically very hard to do both at the same performance: being very good at transactional things and being very good at analytical things. And the analytical things, I guess, are like grouping and aggregating, this type of joining across large, large tables. Typically very hard to do these things.
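
A minimal sketch of the two workloads described above. It uses Python's built-in sqlite3 module as a stand-in so that it runs anywhere; that is an assumption for illustration, since the discussion is about Postgres. The `orders` table and its columns are hypothetical.

```python
import sqlite3

# In-memory database; the table and column names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, amount REAL)"
)

# Transactional workload: insert, update and delete individual rows.
with conn:  # commits on success, rolls back on error
    conn.executemany(
        "INSERT INTO orders (customer, amount) VALUES (?, ?)",
        [("alice", 10.0), ("bob", 25.0), ("alice", 5.0)],
    )
    conn.execute("UPDATE orders SET amount = 30.0 WHERE id = 2")
    conn.execute("DELETE FROM orders WHERE id = 3")

# Analytical workload: scan the table, group and aggregate.
totals = conn.execute(
    "SELECT customer, COUNT(*), SUM(amount) "
    "FROM orders GROUP BY customer ORDER BY customer"
).fetchall()
print(totals)
```

The transactional statements each touch one row by key, while the analytical query has to read every row to build the groups, which is why the two tend to favor different storage designs.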

Speaker 1:

Postgres is transaction focused, and historically always has been, but it has plugins that are analytics focused. I think the best known, which is more time-series focused, is TimescaleDB; it's also a Postgres plugin. And they actually compare pg_analytics to a number of other analytical databases, and it's, of course, much faster than Postgres natively, 94x, but it's also... let me just check that I'm not bullshitting you.

Speaker 1:

It is appreciated. It is more or less on par with ClickHouse, which is very, very impressive, right? It is faster in its own benchmark.

Speaker 2:

So what is ClickHouse? Why is it impressive?

Speaker 1:

ClickHouse is a very fast OLAP database.

Speaker 2:

But is it, like... because I see "Parquet, single." So it's just reading files, I guess.

Speaker 1:

Yeah, yeah, yeah, well, Parquet files, and it will probably bring stuff in memory, cache stuff. And it's also faster than Elasticsearch and than a lot of other things, much faster than TimescaleDB. But again, I don't know the details of this benchmark. Yeah.

Speaker 2:

If.

Speaker 1:

I were the company behind this plugin, I would also make sure that it looks very nice. So I don't know what this means in practice, of course. What they do, actually, and again, I didn't test it or look at the code myself, but they say that they introduce a new type of table. It's called a Delta Lake table and it behaves a bit like a regular Postgres table, but a regular Postgres table is typically row oriented. This is column oriented. This column-oriented table builds on Apache Arrow and, because of that, can do much faster analytical operations.
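
A rough sketch of the row-oriented versus column-oriented distinction mentioned above, in plain Python. This is only an illustration under simplifying assumptions: it is not how Postgres, pg_analytics or Apache Arrow actually store data, and the sample records and the `to_columns` helper are hypothetical.

```python
# Row-oriented: each record is stored together (good for touching one row).
rows = [
    {"id": 1, "customer": "alice", "amount": 10.0},
    {"id": 2, "customer": "bob", "amount": 25.0},
    {"id": 3, "customer": "alice", "amount": 5.0},
]


def to_columns(records):
    """Pivot a list of row dicts into one list per column (hypothetical helper)."""
    columns = {}
    for record in records:
        for name, value in record.items():
            columns.setdefault(name, []).append(value)
    return columns


# Column-oriented: each column is stored together (good for aggregates).
columns = to_columns(rows)

# Summing from rows has to walk every full record to pick out one field.
total_from_rows = sum(record["amount"] for record in rows)

# Summing from columns reads a single contiguous list and ignores the rest.
total_from_columns = sum(columns["amount"])

print(columns["amount"], total_from_rows, total_from_columns)
```

Both layouts give the same answer; the difference is how much unrelated data an aggregate has to pass over, which is the intuition behind column stores being faster for analytical queries.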

Speaker 2:

And I think Apache Arrow, if I remember correctly, is like a convention for how you should lay out your table in memory, and do operations on it.

Speaker 1:

So I think that is very widely used by a lot of technologies, so it looks very interesting. Let's see what it gives in the future. This is the first time I hear about ParadeDB, which is the company behind this, and which will probably in the near future offer a managed service around Postgres analytics, but it's cool to see all these things being built on Postgres.

Speaker 2:

Yeah, true. Maybe one thing about benchmarks in general: when I see a benchmark, I always have a, let's call it, healthy skepticism. Because even if you look at, I don't know, data frames, like pandas, Polars, all these things, right, they also have a whole bunch of benchmarks, and it's always like, "Oh, if you have this much data and if you have three group-bys, two WHERE clauses and this, then this is the fastest." Of course, the top three are kind of repeated, like they circle a bit, right, so you can still get some insights, but to say 94x, what kind of operation are you doing?

Speaker 1:

It's the "get me to click this" operation.

Speaker 2:

Exactly, yeah, and it works. So here we are. Maybe also, I see here "Why DataFusion," and I was trying to see, because I think there's a Rust thing for Apache. You know, it's a...

Speaker 1:

I know. Yeah, they should have added it to the title.

Speaker 2:

Yeah right, I think they should.

Speaker 1:

Query engine built in Rust. It's actually, like, it's in the article which query engine they use. Let me see what people are interested in.

Speaker 2:

It's Apache DataFusion as a query engine. Let's see, Apache DataFusion.

Speaker 1:

You want to see if it's Rust-based, right?

Speaker 2:

I think it is.

Speaker 1:

It is, yeah, I found it. It is. So they should have added it to the title, right? Maybe. I think that you would have clicked. I would have clicked; I'd click anything that is Postgres, really. But if they would have added Rust, you would have clicked as well.

Speaker 2:

You know that meme with the two really strong arms, and there's one, like, Black guy and one white guy, and they're shaking hands? It's like me and you with this. You know, it's like "database," "Rust," and it's like, yes, you know, we can be friends.

Speaker 1:

Which guy are you in that meme? No comment.

Speaker 2:

Okay, all right, I think those are all the topics for today. We will not have the quote or not quote today.

Speaker 1:

No, we need to rethink it a little bit.

Speaker 2:

Yes, because now we're going to have a.

Speaker 1:

So we sometimes have a lag effect. We did the "quote or not quote," where we have three quotes, one of them being real, two of them being fake. Yes, ChatGPT generated or GenAI generated. The challenge is, when we have a guest and they won, that we need to ask the guest to provide them the next time. So we need to think a little bit about how we do this and how we make it session independent. Yeah, so again.

Speaker 2:

It's like the whole iterative thing. You know, I feel like it was something easy to start with. Now we can you know, maybe we can also.

Speaker 1:

We can broaden it up a bit. I think it would be nice, because we also do the streams now. Yeah, maybe we also have the possibility of a visual effect. You do a GenAI image. True, we leave it a bit up to the person that brings it.

Speaker 2:

Quote image. But we still need to cater to our listeners only.

Speaker 1:

That's true. That's true. We'll think about it, we'll come up with something good. Maybe one last...

Speaker 2:

maybe you can give a vivid description of the images Like we can have some sound like a bit of ASMR.

Speaker 1:

Like you can try it, like like I'll practice for next time. Maybe you can whisper a bit description.

Speaker 2:

And then we can ask this is a good test. I'll practice before the live stream.

Speaker 1:

And then when?

Speaker 2:

I'm I'm sorry, I don't think you can try.

Speaker 1:

For the people that are not watching live. Okay, for the listeners: Murilo is holding something. Murilo is holding a bottle of beer. Do we have any

Speaker 2:

sounds here that are like give a nice vibe to the romantic vibe.

Speaker 1:

Yeah, okay, we need to think about this. We're going to do this right next time. We're half-assing it. Yes, next time.

Speaker 2:

Next time. So food for you to join us. Maybe one last thing: my partner, Maria, went to the UK and she kindly brought us... Oh, nice, nice. So I just wanted to... So if you're watching, Maria, see, I brought it to the podcast to offer some to Bart. Do you like sweets? No? I like sweets.

Speaker 1:

This chocolate, I've already had chocolate, right. But it's still closed.

Speaker 2:

Yes, closed. Maybe I should have opened it, because it's going to make a lot of noise, but we'll have some. We'll have it after. Yeah, by the way.

Speaker 1:

Thank you for no, we did a shout out to Maria Shout out.

Speaker 2:

Thank you, Maria. Yeah, so thank you for following the live stream. Next week we have a guest, an external guest. Who do we have? I think, the money, yes, yes, the money and the money.

Speaker 1:

So looking forward, the money and how.

Speaker 2:

His last name. Yeah, yeah, you're going to try that.

Speaker 1:

I'm going to try it. That's a bit, it's a bit dangerous. Yeah, yeah, no, no, we're going to ask him next week. You're going to be cancelled, Bart. We're going to ask him. Yeah, no one wants to get cancelled. He is an MLOps lead at Euroclear. Very interesting. Yes, he's also done a ton of tutoring on machine learning. He holds a bunch of DataCamp courses, exactly.

Speaker 2:

So I'm really excited to have him here. Yeah, yeah, he used to be a colleague, so really nice guy.

Speaker 1:

Yeah.

Speaker 2:

Really happy that he's joining us. Definitely Cool. All right, then I'll see you then. Thanks for listening.

Speaker 1:

Thanks for watching and see you next time.

Speaker 2:

Oh, actually I put the wrong one next week.
