Jake:Welcome to another episode of... Sorry, I went in too strong there, and I felt like I couldn't sustain it, so I'm backing out.
Jake:Let's maybe start softer, and we can build up from there.
Surma:Starts off to, like, hello, hi. And me, Surma.
Jake:Hello. Hello. Welcome to another edition of OTMT, with me, Jake Archibald.
Jake:Excellent. Okay.
Surma:Welcome to the Most Mindful podcast in web development.
Jake:Oh, it's a little bit NPR now, isn't it? We need to pick it up from here.
Jake:I should say, thank you to Shopify for letting us do this, and hosting it, and that kind of stuff. Thank you very much.
Surma:Yeah.
Surma:Thank you, appreciate it.
Jake:So, I was in Newcastle recently, visiting some friends, and I did something I've been wanting to do for a while.
Jake:I did the Chris Jericho Triangle. Have you heard of this? Are you familiar with the Chris...?
Surma:No, I'm assuming it's not a musical instrument?
Jake:No. Chris Jericho is a Canadian wrestler. He was in the WWF when I was a teenager, and then obviously WWE afterwards.
Surma:You say obviously. I don't even know what those letters stand for. But I'm guessing it's different wrestling leagues?
Jake:What was the WWF?
Jake:The World Wrestling Federation was the WWF, and then they had a bit of a dispute with the World Wildlife Fund, and I assume they lost.
Jake:No, there was a cage match, but the problem is, like, the WWF, as in the wildlife one, they brought tigers and bears.
Surma:Was this dispute real or was it just them acting?
Jake:And, you know, man, like, they were just not... there was so, so much blood. So much blood.
Surma:Many people died.
Jake:And now the Wrestling Federation are called WWE instead, and that was many, many years ago.
Surma:And the E stands for?
Jake:Entertainment. Come on.
Jake:How did you get into wrestling?
Surma:World Wrestling Entertainment. It kind of gives it away. I thought the whole point... I remember when I went to school, like, primary school.
Surma:I don't think it was particularly popular in Germany, wrestling overall.
Surma:But it was like, it's all just acting. And other people were like, no, it's not. It's real. Look at them. They're bleeding.
Surma:And it's just like big, big mystery and conspiracy on the primary school playground.
Jake:At school, I remember being sort of made fun of for watching wrestling by a bunch of girls who were really into, like, boy bands.
Jake:Like, they were really into Westlife, and I was proud of my...
Surma:Well, clearly, people in glass houses.
Jake:Well, that was my response. I was like, well, what's the difference between a wrestling fan and a Westlife fan?
Jake:The wrestling fan knows it's fake. You know, I was quite pleased with myself.
Jake:But yeah, no, no, we knew it was just soap opera, you know?
Jake:Anyway, Chris Jericho Triangle.
Jake:This is, oh, I guess a good few years now ago, Chris Jericho visited Newcastle.
Jake:And he posted on Instagram, great day wandering around beautiful hashtag Newcastle getting ready for the sold out show.
Surma:Well, it's just three different sides of the same building.
Jake:And he posted three pictures of his, you know, wandering around beautiful Newcastle.
Jake:And so I, you know, I wanted to walk in his footsteps.
Jake:And I went to the same three places and took the same three photos of myself.
Jake:All three photos were within a 10 meter radius of each other.
Surma:Well, I mean, that's right up your street.
Jake:Not just that, outside the Wetherspoons.
Jake:Yeah, well, I don't know.
Surma:Did you have an honorary pint in each Wetherspoons?
Jake:Oh, no, no, one Wetherspoon.
Jake:Look, they're all in front of the same Wetherspoons.
Surma:Oh.
Jake:Like, they just, just...
Surma:Okay, let me do it again. Did you have three honorary pints in that Wetherspoons?
Jake:No, I did, I did have one though.
Jake:And how would you explain Wetherspoons to international listeners?
Surma:Like the simple way to put it is a pub chain. If you want to go deeper into it, it's owned by a bigot. And technically, you should not be giving them money.
Jake:Oh, yes.
Jake:I would say it's like the McDonald's of pubs.
Surma:Yeah, that's a good way to put it.
Jake:And in the same way that they are both kind of like figureheaded by a clown.
Jake:Yeah, Tim Spoon, who owns Wetherspoons is a bit of a wolf.
Jake:Yeah, a bit of a problem person.
Surma:That name. You really don't have much choice in life, I feel like.
Jake:Oh, that's not his real name.
Jake:I can't remember his real name, but he looks a little...
Jake:Well, and I'm stealing a joke here.
Jake:He looks a bit like Lion-O from the ThunderCats.
Jake:It's as if Lion-O had a racist uncle, that's what he looks like.
Jake:We'll put a link to a picture of, I think it's Tim Martin.
Jake:But I don't know, Tim Spoon.
Surma:I think I prefer that. He shall henceforth be known.
Surma:We've probably mentioned this before, but we have the inverse law of effort. The more effort you put into a blog post, a library or anything really, the less people are going to give a shit and understand what it's about.
Jake:Have you ever thought about blogging about things you've just learned?
Surma:And if you do a quick throwaway article, tweet, mini hack library, people go nuts on average. Less effort, more interest.
Jake:Yes, a version of this has been happening to me recently,
Jake:that I used to blog about things I had just learned.
Jake:And because I was excited, I've just learned this thing.
Jake:I'm excited about learning this thing.
Jake:I'm going to write it down.
Jake:And yeah, like very, very variable results on that.
Jake:Whereas recently I've done a couple of posts
Jake:where it's kind of just stuff I know from being around so long.
Jake:And it's like, oh, now that's a thing.
Jake:That's an unusual piece of entertainment
Jake:that I've just been around so long.
Jake:To sort of know how these things started.
Jake:And it kind of makes me feel very old.
Surma:That is something that many people do not have. The origins of weird attribute quirks.
Jake:But, you know, people have been reading them, which is nice.
Jake:Have you ever thought about age?
Jake:Age.
Surma:So I had this thing where I needed to scratch an itch. I've been working with a lot of WebAssembly stuff, mostly from C and Rust.
Surma:And, you know, in JavaScript, we have a couple of ways to figure out what a big JavaScript bundle consists of.
Jake:Oh, yes, you get those lovely little diagrams.
Jake:There's plug-ins for Webpack and Rollup.
Surma:Yeah, my go-to is usually Source Map Explorer, which is on NPM. And you just give it your source map. And then it gives you, I think it's called a tree map, which is a rectangle.
Jake:Yes.
Surma:And it partitions that rectangle into areas of proportional size to how much any given entity consumes.
Surma:So if you have 20% is your code, 80% is React, that rectangle is going to be divided into 80% for a rectangle called React and 20% for a rectangle called your code.
Jake:Yes.
Surma:And it does this by folder structure. So there will be a big rectangle for node modules, but that in itself is going to be subdivided.
Jake:Yes.
Surma:And it's quite a nice way to figure out why is this JavaScript bundle 1.2 megabytes.
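The data behind a treemap like this is just a byte total per folder, where each parent includes its children. Here is a minimal Python sketch of that roll-up. The paths and byte counts are made up, and `folder_totals` is a hypothetical helper, not something Source Map Explorer exposes.

```python
from collections import defaultdict

def folder_totals(file_sizes):
    """Sum bytes for every path prefix, so each folder includes its children."""
    totals = defaultdict(int)
    for path, size in file_sizes.items():
        parts = path.split("/")
        # Credit the file's bytes to every ancestor folder and to the file itself.
        for depth in range(1, len(parts) + 1):
            totals["/".join(parts[:depth])] += size
    return dict(totals)

# Hypothetical per-file byte counts for one bundle.
sizes = {
    "node_modules/react/index.js": 800,
    "src/app.js": 150,
    "src/util.js": 50,
}
totals = folder_totals(sizes)
bundle = sum(sizes.values())
for name in ("node_modules", "src"):
    print(f"{name}: {totals[name] / bundle:.0%}")
```

A treemap then draws one rectangle per entry, sized by its share of the parent.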
Jake:Historically an amazing way to find out
Jake:where all of your hard drive space is going.
Jake:That's what these tools use.
Jake:And it works exactly the same way for a JavaScript bundle.
Surma:Oh, true.
Surma:I feel like Source Map Explorer hasn't been maintained super well. Like, over time, it seems that it's not... I wonder if it doesn't support the newer source map versions.
Surma:Because, like, with more recent bundles, like what Vite spits out, it just has a big unknown area, but then other tools get it right.
Jake:Yes.
Surma:Anyway, what I needed is that, but for my WebAssembly binary, because it got bigger.
Jake:Hmm.
Surma:And I wanted to, you know, in Rust, I just wrote my code, but I also pulled in some crates, as libraries are called in Rust.
Surma:And I was like, well, I wonder if it's this or that library that is actually making this WebAssembly binary bigger.
Surma:And I went on a journey to try and figure out where can I get this data and can I somehow visualize it.
Jake:Yes.
Surma:And I want to tell you about it, Jake.
Surma:Well, actually, I also want to tell you about source maps, because that was a whole parallel journey.
Jake:Yes.
Surma:So, I guess, actually, now that I say all this, what I want to call this episode is debugging on the web, because both of these things are related to that.
Jake:Yes.
Surma:But, yeah, so that's what we're going to talk about.
Surma:If you take a step back, you know, we have JavaScript and it gets bundled, it gets minified, it gets all kinds of...
Jake:Yes.
Jake:I think your title is correct.
Jake:I just don't think it's going to do well on the socials.
Surma:Oh, that's true.
Jake:I think we need to think of, like,
Jake:I looked at what was in my WebAssembly binary
Surma:I could also... I have beef with source maps.
Jake:and I couldn't believe what I found.
Jake:Something like that, right?
Jake:Come on.
Surma:There you go.
Jake:Oh, yeah.
Jake:So do I.
Surma:Oh, maybe I should have gravy with source maps.
Jake:No, I have gravy with beef.
Jake:But, yeah.
Surma:So, I don't know which way to start, but I think I'm going to start with source maps,
Jake:Maybe.
Surma:because I'm assuming our audience is, on average, slightly more familiar with their existence and what they're commonly used for.
Jake:Maybe.
Surma:And what I learned while doing the research for this is that source maps were kind of like an ad hoc invention by an engineer at Google working on Gmail.
Jake:Oh, okay.
Surma:Gmail, probably using GWT at the time or whatever, but the JavaScript in the end was compiled by Closure Compiler,
Surma:which, you know, for a long time was deemed to be the strongest JavaScript optimizer tool out there, until, I think, it was superseded.
Jake:And even then, like, you know,
Jake:sometimes I'll say on Twitter,
Surma:But there was a time where you would add Closure Compiler to your Gulp toolchain.
Jake:I'm importing a function from a file
Jake:and then another file is importing
Jake:another function from that file.
Surma:I think Closure Compiler, and this is not to take away from all our bundlers and minifiers,
Jake:I wish bundlers were clever enough
Jake:to split that out.
Jake:You know, as in, you know,
Jake:each function goes separately
Jake:into the individual bundles.
Jake:And then someone will go,
Jake:um, actually, um,
Jake:Closure Compiler can do it.
Jake:So, yeah.
Jake:Closure Compiler can do it.
Jake:So I still think there's lots of stuff in there
Jake:that, you know.
Surma:but it's a proper compiler.
Jake:Mmm.
Surma:Like, it has a lot more of these techniques built in that we usually associate with compilers.
Surma:And at the time, I think it was all kinds of stuff.
Surma:You could add extra annotations, and it was really quite good.
Surma:But the problem was that at the time, and we're talking like 2009 here,
Surma:this engineer working on Gmail was trying to debug Gmail and the JavaScript that Closure Compiler had spat out,
Jake:Mmm.
Surma:which also, because Google at the time was already aware that every byte counts,
Surma:had minified variable names, all kinds of minification tricks.
Jake:Mmm.
Surma:Really hard to look at and debug in Firebug at the time.
Jake:Oh, wow, yeah.
Surma:So this guy just sat down and basically implemented a little thing into Closure
Surma:that would keep track of the code that's just being emitted.
Surma:What does it correspond to in the original input files?
Jake:Mmm.
Surma:And that's kind of what source maps are.
Surma:Like, it just gives you for every point in your generated code,
Jake:Mmm.
Surma:it has a way to tell you what is the position in the original source that corresponds to it.
Jake:Mmm.
Jake:Mmm.
Surma:And there's some extra bits involved.
Surma:But basically, he just did this to help himself debug.
Jake:Mmm.
Surma:And it kind of picked up steam.
Surma:And here we are with source maps now being the...
Jake:Mmm.
Jake:Mmm.
Surma:It's not a standard.
Surma:It has a spec-ish that's just a Google doc that has been living for over 13 years now
Surma:and has been worked on last year still.
Surma:But it is kind of a de facto standard of JavaScript tooling, isn't it?
Jake:Yes.
Jake:And I do think there is now an effort
Jake:to do it better, right,
Jake:to actually sort of take it into TC39.
Surma:Oh, is there?
Jake:Now, I'm kind of going from memory here,
Jake:but I remember that being some chatter
Jake:that the DevTools teams were wanting
Jake:because, you know what,
Jake:source maps aren't perfect.
Jake:The amount of projects where I just go into DevTools
Surma:Yeah.
Jake:and turn them off
Jake:because I feel like I'm being lied to
Jake:by DevTools, you know.
Jake:And it's like, look,
Jake:I just pressed, you know,
Jake:skip the next line
Jake:and something else happened.
Jake:You know what I mean?
Jake:That's not what just happened.
Jake:And I'm trying to debug this properly.
Jake:I want to debug what's actually running,
Jake:because I'm looking at...
Surma:Exactly.
Jake:So I think there is some appetite
Jake:to kind of revisit it
Jake:and solve the problems that exist
Surma:Yeah, I wanted to try to explain a little bit what is in a source map.
Jake:and actually standardize it properly.
Surma:On the one hand, because it helped me understand what it can and cannot do
Jake:Mmm.
Surma:and also explain why these weird behaviors sometimes happen.
Surma:So I guess the first thing we need to figure out is,
Surma:how does a browser, when you debug code, even know that a source map is available?
Surma:And usually, that's a special comment.
Jake:Yes.
Surma:I think typically at the end of the file,
Surma:which is like a slash slash hash,
Surma:and it says sourceMappingURL equals,
Jake:And it could be inline as well, right?
Surma:and it's a URL to the actual source map file,
Jake:Like inline is an option.
Jake:Hmm.
Surma:which typically has the same file name with .map at the end.
Jake:Ah.
Surma:That's what I was going to say.
Surma:It is a URL.
Jake:Hmm.
Surma:So a data URL is completely fine.
Jake:Ah.
Surma:And it's not uncommon.
Surma:And that's usually called an inline source map.
Surma:But whatever that file is,
Surma:whether it's an inline base64 encoded data URL
Surma:or the other file that you actually have to fetch,
Surma:that file is just a JSON file with a handful of properties, really.
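To make the comment convention concrete, here is a small Python sketch that finds the `//# sourceMappingURL=` comment and, if the URL is a `data:` URL, decodes the inline map. The helper names `find_source_map_url` and `load_inline_map` are made up for illustration, and the tiny map at the bottom is fabricated test data.

```python
import base64
import json
import re
from typing import Optional
from urllib.parse import unquote

# Comment convention: //# sourceMappingURL=<url> (older tools wrote //@ instead).
_COMMENT = re.compile(r"^//[#@]\s*sourceMappingURL=(\S+)\s*$", re.MULTILINE)

def find_source_map_url(js_text: str) -> Optional[str]:
    """Return the URL from the last sourceMappingURL comment, if any."""
    matches = _COMMENT.findall(js_text)
    return matches[-1] if matches else None

def load_inline_map(url: str) -> Optional[dict]:
    """Decode a data: URL source map; return None for ordinary URLs."""
    if not url.startswith("data:"):
        return None  # a regular URL would have to be fetched separately
    header, _, payload = url.partition(",")
    if header.endswith(";base64"):
        raw = base64.b64decode(payload).decode("utf-8")
    else:
        raw = unquote(payload)
    return json.loads(raw)

# Build a generated file with an inline source map, then read it back.
tiny_map = {"version": 3, "sources": ["a.js"], "names": [], "mappings": "AAAA"}
encoded = base64.b64encode(json.dumps(tiny_map).encode("utf-8")).decode("ascii")
js = "console.log(1);\n//# sourceMappingURL=data:application/json;base64," + encoded + "\n"
url = find_source_map_url(js)
print(load_inline_map(url)["sources"])
```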
Jake:Hmm.
Surma:So what is in there?
Surma:One thing I found is that Evan, who did esbuild,
Jake:Hmm.
Surma:also wrote a source map visualizer.
Surma:This is not like a bundle analyzer to tell you
Surma:this source file is consuming this many bytes in a bundle,
Surma:but it actually visualizes the original source code
Surma:next to the minified source code.
Surma:So when you hover over a bit,
Surma:it shows an arrow to where it ended up on the other side,
Jake:Oh, nice.
Surma:and you can literally see which files have been mangled together
Surma:in different ways.
Jake:Oh, we'll link to that, because I want to play with that.
Surma:Yeah, absolutely.
Surma:That will, of course, be in the description.
Surma:There's also going to be a link to the spec
Jake:Oh.
Surma:and to a Hacker News comment from the inventor of source maps,
Surma:that Gmail engineer, which has a bit of details on the history.
Surma:But yeah, it is just a JSON object,
Surma:which really, it is quite straightforward, I want to say.
Surma:So it has the name of the file that this source map is for.
Surma:So the generated file links to the source map.
Surma:The source map links to the generated file.
Surma:So you just know how to map from one to the other.
Surma:You have a list of the original source file names
Surma:that were incorporated.
Jake:Hmm.
Surma:That's mostly because you don't want to repeat the string
Surma:multiple times later on, but just like, say,
Surma:this is source file one, source file two,
Surma:which makes the whole source map smaller,
Surma:even though they're already quite big.
Surma:There is sometimes a list of symbols.
Surma:This helps you unminify variable or function names.
Surma:So it tells you, ah, this function is called O in the output,
Surma:but the original name was delete all files on hard drive
Jake:Hmm.
Surma:or whatever, like it tells you that.
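Written out, the object described so far looks roughly like the following Python dict. The property names `version`, `file`, `sources`, `names` and `mappings` are the ones source map files use; the file names and the symbol are invented for illustration.

```python
import json

source_map = {
    "version": 3,
    # The generated file this map belongs to.
    "file": "bundle.min.js",
    # Original source files, referenced later by index rather than by name.
    "sources": ["src/app.js", "src/util.js"],
    # Original identifiers, so a minified name can be mapped back.
    "names": ["deleteAllFilesOnHardDrive"],
    # The long encoded string of segments.
    "mappings": "AAAA,SAASA",
}

print(json.dumps(source_map, indent=2))
```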
Surma:And then the big part is the mappings property,
Jake:Hmm.
Surma:which is a very long string.
Surma:And looking at it now, I actually am surprised it's a string
Jake:Hmm.
Surma:because it is really a two-dimensional array.
Surma:That's for some reason has been string encoded.
Surma:I'm not sure what the reasoning behind it is.
Surma:I wonder if it just compresses better. I don't know.
Surma:So basically it's a very long string.
Surma:And if you were to split it by semicolon,
Surma:each entry in that resulting array represents one line
Surma:in the output file.
Jake:OK.
Surma:And then each of these lines you can split by comma,
Surma:which gives you an array of segments in that line.
Jake:Hmm.
Surma:And segments are these little blobs in the output file
Surma:that can be mapped to a source file.
Surma:Now, I'm not going to try to explain the actual encoding,
Surma:but it's a bit like UTF-8.
Jake:Hmm.
Surma:It's called VLQ.
Surma:Basically, it's just up to five numbers,
Jake:Hmm.
Surma:which is like what file is it,
Surma:what line in the original source,
Jake:Hmm.
Surma:what column in the original source,
Surma:and potentially what original name for this variable.
Jake:Hmm.
Surma:But it really just maps a point to a point.
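For the curious, the split-by-semicolon, split-by-comma structure and the VLQ numbers can be decoded in a few lines. This is a minimal Python sketch, not a full implementation: it assumes well-formed input and does no validation. One detail not spelled out above is that the numbers are stored as deltas: the generated column restarts on every output line, while the other fields carry over from the previous segment.

```python
B64 = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"

def decode_vlq(segment):
    """Decode one base64 VLQ segment into a list of signed integers."""
    values, shift, acc = [], 0, 0
    for ch in segment:
        digit = B64.index(ch)
        acc |= (digit & 31) << shift   # low 5 bits carry data
        if digit & 32:                 # bit 6 set: more digits follow
            shift += 5
        else:
            sign = -1 if acc & 1 else 1  # lowest bit of the value is the sign
            values.append(sign * (acc >> 1))
            shift, acc = 0, 0
    return values

def decode_mappings(mappings):
    """Return, per output line, absolute (gen_col, src, line, col, name) tuples."""
    lines = []
    src = src_line = src_col = name = 0  # these carry over across lines
    for line in mappings.split(";"):
        gen_col = 0  # the generated column restarts on every output line
        segments = []
        for seg in filter(None, line.split(",")):
            fields = decode_vlq(seg)
            gen_col += fields[0]
            if len(fields) == 1:  # segment with no source information
                segments.append((gen_col, None, None, None, None))
                continue
            src += fields[1]
            src_line += fields[2]
            src_col += fields[3]
            name_index = None
            if len(fields) == 5:
                name += fields[4]
                name_index = name
            segments.append((gen_col, src, src_line, src_col, name_index))
        lines.append(segments)
    return lines

print(decode_mappings("AAAA,SAASA;AACA"))
```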
Surma:And this is one of the things I realized
Jake:Hmm.
Surma:where things get iffy with source maps
Surma:because it doesn't really tell you the range that it maps to.
Jake:Hmm.
Surma:So like in the generated code,
Surma:you obviously just have this list of segments
Surma:and they are going to be,
Surma:if you say one segment starts in column five
Surma:and the next segment starts in column eight,
Surma:you know the first segment is three characters long
Surma:or something.
Surma:But what it doesn't tell you, for where it maps to,
Surma:is how long that is.
Surma:And I think that's where the problem sometimes comes in
Surma:where you're technically setting a break point
Jake:Hmm.
Surma:in the middle of a segment,
Surma:but DevTools can only set the break point to the start
Surma:or jump to the start.
Surma:And especially with minifiers,
Surma:putting multiple statements into one line
Surma:and using commas instead of semicolons
Surma:and potentially skipping certain things,
Surma:I don't think the mapping it can express is powerful enough
Surma:to handle all modern minification techniques.
Surma:But I'm going to get a bit more into that
Jake:Oh, interesting.
Surma:because it's a very different story in WebAssembly
Surma:that's quite interesting.
Surma:So yeah, basically what the browsers do,
Jake:Hmm.
Surma:you know, stepping through the generated code,
Surma:the minified code is quite easy for a JavaScript engine.
Jake:Hmm.
Surma:And then whenever it wants to figure out
Jake:Hmm.
Surma:where it currently is,
Surma:it just basically does a binary search
Jake:Hmm.
Surma:through this mapping data
Surma:and figures out what is the correct corresponding source file.
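The lookup can be sketched with Python's `bisect` module. This is an illustration of the idea, not how any particular browser implements it, and the segment data is invented. It also shows the point-to-point limitation: a column in the middle of a segment snaps back to that segment's start.

```python
from bisect import bisect_right

# Hypothetical decoded segments for one generated line, sorted by generated column:
# (generated_column, source_file, source_line, source_column)
segments = [
    (0, "src/app.js", 0, 0),
    (9, "src/app.js", 0, 9),
    (14, "src/util.js", 3, 2),
]
starts = [seg[0] for seg in segments]

def original_position(generated_column):
    """Find the last segment that starts at or before the given column."""
    i = bisect_right(starts, generated_column) - 1
    return segments[i] if i >= 0 else None

# Column 11 sits inside the segment that starts at column 9,
# so the best available answer is that segment's starting point.
print(original_position(11))
```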
Jake:Hmm.
Surma:And then it can either fetch the source file
Jake:Hmm.
Surma:or sometimes the contents of the source files
Jake:Hmm.
Surma:are also embedded in the source map itself,
Surma:which makes them even bigger.
Surma:But that allows you to debug in production as well.
Jake:I think that's the more common way of doing it, right?
Jake:Like that's, I think...
Surma:I think so, yeah.
Jake:Yeah.
Surma:I guess sometimes if you want to debug in production,
Surma:it might be good to strip it
Jake:Hmm.
Surma:because you don't necessarily want to leak your source code.
Surma:But if you have the source code locally available,
Surma:you can give it to DevTools
Surma:so it can resolve it while debugging
Surma:while people who are out there in the world
Jake:Hmm.
Surma:can't really deduce anything meaningful from it.
Jake:I was going to say, does that
Jake:actually matter
Jake:because, you know, your minification
Jake:shouldn't be a security step, but
Surma:Yes.
Jake:in the sort of
Jake:newer world of Remix
Jake:and sort of
Jake:the React server components stuff
Jake:where you have, like,
Jake:your database code in the same
Jake:file as your client code
Jake:I guess that's where it would become
Jake:potentially difficult, right?
Surma:I would agree that minification,
Surma:all this stuff is not a security primitive.
Surma:But for certain things.
Surma:It's always the same.
Surma:On my screen, technically, I can download it.
Jake:Hmm.
Surma:But sometimes you don't want that,
Surma:while still allowing people to look at stuff on screen.
Jake:Right.
Surma:And so I think you just want to make it hard.
Surma:And so debugging could also make it harder
Jake:Yeah, okay.
Surma:for people to actually inject code in the right place
Surma:if they want to write a malicious browser extension.
Surma:So I think, yeah, it's not a good protection.
Surma:It's just a hurdle.
Surma:But it shouldn't be at all...
Surma:It shouldn't be the only thing you do
Surma:to try and keep your users safe.
Jake:Oh, but you definitely wouldn't want your server
Jake:code crossing that boundary, which technically
Surma:For sure.
Jake:in a build system
Surma:No, that's actually a good point.
Surma:So there, in that world,
Surma:it's probably more important to do that.
Surma:To not necessarily have your source maps
Surma:contain the full source code of all your source files.
Jake:you wouldn't want.
Surma:This is already the end of what source maps contain.
Surma:So it's really just a list of segments
Surma:that map to your original source.
Surma:And so what Source Map Explorer does,
Surma:and what I now did as well,
Surma:in the tool I wrote that I'm going to talk about in a little bit,
Surma:you basically just iterate over all the segments
Surma:in the generated code.
Surma:You figure out how long they are
Surma:by looking where does the next segment start.
Surma:And then map that,
Jake:Hmm.
Surma:like figure out what does the segment
Surma:correspond to in my original source.
Surma:Actually, not really what,
Surma:but just like which file did the segment come from.
Jake:Mm-hmm.
Surma:And so I just built a dictionary
Surma:where I have counters for each source file.
Surma:And when I find a segment that is eight bytes long,
Surma:I'm going to increase that file's count by eight.
Surma:And when I go through the entire source map,
Surma:I now have a dictionary where for each file,
Surma:I know how many bytes it contributed in the bundle.
Surma:And now I can visualize it.
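The counting step can be sketched like this in Python. It takes already-decoded segments as input and is not the code of the tool discussed here. It also assumes one column equals one byte, which only holds for ASCII output, and it ignores newline characters. The sample data is invented.

```python
from collections import Counter

def bytes_per_source(lines, line_lengths, sources):
    """Attribute every column of the generated file to a source file.

    lines[i] is the sorted list of (generated_column, source_index) pairs for
    output line i; line_lengths[i] is the length of that line.
    """
    counts = Counter()
    for segs, length in zip(lines, line_lengths):
        cursor, owner = 0, "[unmapped]"
        for col, src in segs:
            # A segment ends where the next one starts.
            if col > cursor:
                counts[owner] += col - cursor
            cursor = col
            owner = sources[src] if src is not None else "[unmapped]"
        # The last segment runs to the end of the line.
        if length > cursor:
            counts[owner] += length - cursor
    return counts

sources = ["src/app.js", "src/util.js"]
lines = [[(0, 0), (9, 0), (14, 1)], [(0, 1)]]
line_lengths = [20, 8]
counts = bytes_per_source(lines, line_lengths, sources)
print(dict(counts))
```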
Surma:So with source maps,
Surma:it actually was quite easy to like build
Surma:almost like a clone of Source Map Explorer.
Jake:Hmm.
Jake:Nice.
Surma:However, I told the story historically a bit reversed
Surma:because originally, as I kind of,
Surma:I guess already hinted at the start,
Surma:is I needed this for WebAssembly.
Surma:And WebAssembly doesn't use source maps,
Surma:which is now very understandable to me
Surma:because like it's a compile target.
Surma:And most compilers already have a way to allow debugging.
Surma:And that is with DWARF,
Surma:which is I guess the de facto standard
Jake:And this is a cross
Surma:or an actual standard for debugging symbols
Surma:in native binaries.
Surma:Yeah, so the DWARF, the spec is very much explicitly
Jake:language thing, like, it's not a Rust
Jake:thing, it's not a...
Surma:language and architecture agnostic.
Surma:So it's not just for Intel processors or ARM processors.
Surma:It doesn't care.
Surma:And it doesn't care about the source language.
Surma:And it's been very longstanding.
Surma:It's now in the fifth version.
Surma:And basically that is what GDB or LLDB use
Jake:Hmm.
Surma:when you want to step through a binary.
Jake:Hmm.
Surma:And the whole design around DWARF
Surma:is that it's very quick to parse,
Surma:very quick to access or to find
Surma:and access the things you're looking for.
Surma:It is very capable and flexible.
Jake:Hmm.
Jake:So presumably this is working
Jake:very similar to source maps, so your binary
Jake:now has a bit of extra
Jake:metadata that is able to
Surma:Yeah.
Surma:Yeah.
Jake:map a certain
Jake:bit of binary to
Jake:essentially a sort of assembly-esque
Jake:kind of stuff at the end, like map
Jake:that to some source
Surma:There is a lot of similarity.
Jake:code.
Surma:DWARF actually does even more,
Surma:a lot more than source maps.
Surma:And it's actually not easy to wrap your head around it
Surma:because of how much it needs to be able to express.
Jake:Hmm.
Surma:And so like, I'm not an expert in DWARF.
Surma:I started skimming the spec,
Surma:but it is quite a lot.
Surma:But I did learn a little bit about it
Surma:while I was building my WebAssembly bundle analysis tool.
Surma:And so I thought I would give, like,
Surma:a really quick summary of what is in there
Jake:Hmm.
Surma:to just see where native land, I guess,
Surma:has ended up to do these kind of debugging procedures.
Surma:And I'm guessing at the time,
Surma:because it was like born out of an ad hoc idea,
Surma:it wouldn't have made sense to use DWARF.
Surma:But now that I look at it,
Surma:DWARF, because it is so widespread and supported
Surma:in so many compilers,
Surma:there's so much tooling out there
Surma:that it might actually have been beneficial to use DWARF.
Jake:Oh.
Surma:But that's, you know, like hindsight is 20-20.
Jake:Well, they might be
Surma:So maybe...
Jake:looking at it again, so yeah.
Surma:So like DWARF itself is...
Surma:I don't want to go too deep,
Surma:but it's basically a tree data structure
Surma:of what they call DIEs,
Surma:debugging information entries.
Surma:And those are really a bit like JavaScript objects.
Surma:They have attributes,
Surma:and these attributes have values.
Surma:And there's a whole bunch of predefined attributes
Surma:with specific meaning.
Surma:But as far as I can tell,
Surma:you can also add arbitrary custom attributes
Surma:for certain reasons.
Jake:Hmm.
Surma:So in the end,
Surma:what you kind of get is just debugging entries,
Surma:and they form kind of a graph.
Surma:And I could go a bit more into it,
Surma:but I think it's actually not that interesting.
Surma:If you have a WebAssembly binary,
Surma:or actually any native binary with debugging symbols,
Surma:there is a tool from LLVM.
Surma:So that's, you know, Clang, Rust compiler,
Surma:all LLVM-based.
Surma:So usually you have this on your system
Surma:called llvm-dwarfdump.
Surma:And if you give a binary to that,
Surma:you can just get a big console dump,
Jake:Hmm.
Surma:and you can kind of scroll through
Surma:and see what it means.
Surma:But basically each debugging entry,
Surma:each of these entries in the graph,
Surma:has a source name and a source file,
Surma:and a range or a list of ranges
Surma:of where this debugging entry
Surma:is occupying the binary.
Surma:So it's actually a bit the other way around.
Surma:You don't go through the binary
Surma:and have mappings from segments to source file,
Surma:but rather you have a debug entry
Surma:that is like a function or a block or a variable.
Jake:Hmm.
Surma:And it tells you all the locations in the binary
Surma:where this entity is appearing in different ways.
Surma:So for example, if there was a function called main,
Jake:Oh.
Surma:that would usually just have, you know,
Surma:source file, the name is main,
Surma:and then probably just a start and an end point
Surma:because the main has been mapped to here.
Surma:Then inside this main,
Surma:there could, because it's a tree,
Surma:there would be sub entries,
Surma:where it's like, in the main function,
Jake:Hmm.
Surma:I have a variable,
Surma:and that is used here, here, and here.
Jake:Hmm.
Surma:And so that is already something
Jake:Hmm.
Surma:that DWARF can express that source maps
Surma:not necessarily could,
Jake:Hmm.
Surma:but you have this tie back to what is the same entity.
Surma:And so I was like, oh cool, this looks easy enough.
Surma:I'll just iterate over all the entries in this graph
Surma:and count the bytes by the ranges they have.
Surma:But the result was that my total byte count
Jake:Hmm.
Surma:was a multiple of the size of the actual file I was trying to analyze.
Surma:So clearly I was double, triple, quadruple counting bytes.
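A toy model shows how this kind of over-counting can happen. The `Die` class below is a made-up stand-in for a debugging information entry, not the real DWARF layout, and the address ranges are invented. Because child entries live inside the ranges of their parents, summing every range counts the same bytes several times, while merging overlapping ranges first gives the real coverage.

```python
from dataclasses import dataclass, field

@dataclass
class Die:
    """Toy stand-in for a DWARF debugging information entry."""
    name: str
    ranges: list                      # half-open (low, high) address ranges
    children: list = field(default_factory=list)

def walk(die):
    """Yield the entry and all of its descendants."""
    yield die
    for child in die.children:
        yield from walk(child)

def naive_total(root):
    # Sums every range of every entry, so nested entries are counted again.
    return sum(high - low for die in walk(root) for low, high in die.ranges)

def covered_total(root):
    # Merges overlapping ranges first, so each byte is counted once.
    total, reach = 0, None
    for low, high in sorted(r for die in walk(root) for r in die.ranges):
        if reach is None or low > reach:
            total += high - low
            reach = high
        elif high > reach:
            total += high - reach
            reach = high
    return total

main = Die("main", [(0, 100)], [
    Die("block", [(10, 40)], [Die("pixel", [(12, 20), (30, 36)])]),
    Die("inlined helper", [(50, 80)]),
])
print(naive_total(main), covered_total(main))
```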
Jake:Hmm.
Surma:And that actually makes a lot of sense
Surma:because we are now in compiled language land
Jake:Hmm.
Surma:and a bit like what Closure did,
Jake:Hmm.
Surma:they do a lot more than what just bundlers
Jake:Hmm.
Surma:and minifiers mostly do.
Surma:They do stuff like function inlining.
Surma:So what used to be a function call,
Surma:rather than doing a call,
Jake:Hmm.
Surma:they just copy the code from the function
Jake:Hmm.
Surma:at that place where you call the function
Surma:to avoid the overhead of jumping to another function.
Surma:Then Rust has generics or C++ has templates.
Jake:Oh!
Surma:There's this thing called monomorphization
Surma:or specialization where you create an instance
Surma:of a generic function because you're using it
Surma:with a specific type.
Surma:So if I have a function called add
Surma:and it takes any type, when I call it with a float
Surma:and then I call it with an integer,
Surma:this function will be duplicated once for floats
Surma:and once for integers.
Surma:So this function now exists twice,
Jake:Oh, interesting.
Surma:even though in the source code, it only exists once.
Surma:And then, you know, compilers in general
Surma:are allowed to, like, reorder and fragment things.
Jake:And I guess this is a little bit like when, well, in Emscripten,
Jake:which I presume comes from the C compilers,
Jake:where you can compile for performance or you can compile for file size,
Jake:and I presume that's one of the differences,
Jake:like the compile for file size is not going to do function inlining,
Jake:but that is at the expense of performance.
Jake:Right! Okay.
Surma:Exactly.
Surma:Yeah, so these flags with like what you optimize for
Surma:is a set of optimization passes.
Surma:And that's exactly it.
Surma:If you optimize for size, you're saying,
Surma:I'm okay having slower code if the binary is small.
Jake:Mm-hmm.
Surma:And it goes, okay.
Surma:Not only am I not going to inline,
Surma:but if I find the same code in different places,
Surma:I'm going to create a new function and call it
Jake:Oh, nice.
Surma:rather than leave it.
Surma:Like, I don't know if that is exactly what's happening,
Surma:but I would assume that would be quite a trivial optimization
Surma:to add to save file size.
Surma:So that's what these flags usually decide,
Jake:Mm-hmm.
Surma:which kind of passes are enabled or disabled
Surma:and how are they configured.
Surma:So luckily there is, I want to give a shout out to Gimli,
Surma:which is a Rust crate to read DWARF data,
Surma:but even more so that they have, right, right.
Jake:Oh, I get it. I get it.
Jake:Yeah. Yeah, very good. Very good.
Surma:And they have an example program.
Surma:Like here's how you use Gimli correctly.
Surma:And an example program they have is addr2line,
Surma:address to line, which you give a DWARF file
Surma:or a binary with DWARF data in it,
Surma:and you give it an address,
Jake:Mm-hmm.
Surma:basically a memory address of where a function
Surma:or whatever is.
Surma:And it tells you which source file
Surma:and which line did this come from?
Surma:So I was basically able to use most of that code
Surma:to now iterate through all the memory addresses
Surma:that the program would be loaded to in memory
Surma:and figure out which source file does that come from.
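The address-by-address walk can be sketched in Python with a made-up line table. This does not call gimli or addr2line; the rows, file names and binary size are invented, and `addr_to_line` is a hypothetical stand-in for the lookup those tools provide.

```python
from bisect import bisect_right
from collections import Counter

# Hypothetical line table rows, sorted by address: (start_address, source_file, line)
line_table = [
    (0, "src/main.rs", 3),
    (24, "vendor/image/lib.rs", 120),
    (60, "src/main.rs", 7),
    (72, "core/iter.rs", 45),
]
binary_size = 100
starts = [row[0] for row in line_table]

def addr_to_line(address):
    """Return (file, line) for the row covering this address, or None."""
    i = bisect_right(starts, address) - 1
    if i < 0:
        return None
    _, source_file, line = line_table[i]
    return source_file, line

# Walk every address and credit one byte to whichever file it came from.
per_file = Counter()
for address in range(binary_size):
    hit = addr_to_line(address)
    per_file[hit[0] if hit else "[unknown]"] += 1

print(dict(per_file))
```

Walking byte by byte is slow but simple; iterating over the table rows and subtracting neighbouring start addresses would give the same totals.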
Surma:So now we're kind of back to the previous approach
Surma:I took with source maps,
Surma:which byte corresponds to which source file,
Surma:but still it gets tricky with this monomorphization stuff.
Jake:Mm-hmm.
Surma:Like when you create a new version of a function
Surma:for a specific type,
Surma:because this could be a generic function from a library,
Surma:even the standard library,
Surma:but now it gets instantiated
Surma:for one of your types specifically.
Surma:So do you attribute that to your library, your code?
Surma:Do you attribute it to the core library,
Surma:or the consumer crate?
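Editor's sketch: a minimal Rust illustration of the monomorphization being described. The `describe` function stands in for a generic library function and `Health` for a type defined in your own code; both names are made up for this example.

```rust
use std::fmt::{self, Display};

/// Stand-in for a generic library function. The compiler emits a separate
/// machine-code copy for each concrete `T` it is called with.
fn describe<T: Display>(value: T) -> String {
    format!("value = {value}")
}

/// A type that lives in "your" code rather than in the library.
struct Health(u32);

impl Display for Health {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "{} hp", self.0)
    }
}

fn main() {
    // Two instantiations end up in the binary:
    // describe::<u8> and describe::<Health>.
    println!("{}", describe(7u8));
    println!("{}", describe(Health(20)));
}
```

The source text of `describe` belongs to the library, but the `describe::<Health>` copy only exists because your code asked for it, which is the attribution question raised above.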
Surma:And then after all the specialization kicks in,
Jake:Right!
Surma:now optimization comes.
Surma:There was a really interesting example I found.
Surma:Like imagine you have a function
Jake:Mm-hmm.
Surma:in your standard library to iterate over a 2D grid.
Surma:You give it like a 2D grid
Surma:and it calls a callback for every cell in this grid.
Surma:And this function, quite sensibly,
Surma:could be used for making each pixel in an image brighter,
Surma:but also could give each unit
Surma:in a specific region more health.
Surma:It is very generic and universal.
Surma:But let's say that me using this function
Surma:to make each pixel brighter gets inlined.
Surma:So rather than calling a function
Surma:and having a nested for loop in that function,
Jake:Mm-hmm.
Surma:now this nested for loop is in my code.
Surma:And now my callback is not a callback,
Surma:but it's inlined there as well.
Surma:And after optimizing it,
Surma:it figures out that part of my callback code
Surma:is actually independent of the inner loop
Surma:because there's some pre-calculation per row
Surma:that I could do.
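Editor's sketch: the grid example can be written out by hand to show what "inlined and then hoisted" might look like. `for_each_cell` is a hypothetical library helper, and `brighten_optimized` is a hand-written approximation of what an optimizer could produce, not real compiler output.

```rust
/// Hypothetical library helper: calls `f(x, y)` for every cell of a grid.
fn for_each_cell(width: usize, height: usize, mut f: impl FnMut(usize, usize)) {
    for y in 0..height {
        for x in 0..width {
            f(x, y);
        }
    }
}

/// What the programmer writes: a callback handed to the library helper.
fn brighten(pixels: &mut [u8], width: usize, height: usize) {
    for_each_cell(width, height, |x, y| {
        let row_start = y * width; // depends only on y, not on x
        let p = &mut pixels[row_start + x];
        *p = (*p).saturating_add(10);
    });
}

/// Roughly what the code may look like after the loops and the callback are
/// inlined and the per-row calculation is moved out of the inner loop.
fn brighten_optimized(pixels: &mut [u8], width: usize, height: usize) {
    for y in 0..height {
        // The caller's code now sits between the library's two loops.
        let row_start = y * width;
        for x in 0..width {
            let p = &mut pixels[row_start + x];
            *p = (*p).saturating_add(10);
        }
    }
}

fn main() {
    let mut a = vec![0u8, 100, 250, 7, 8, 9];
    let mut b = a.clone();
    brighten(&mut a, 3, 2);
    brighten_optimized(&mut b, 3, 2);
    println!("{a:?} {b:?}");
}
```

In the second version, lines that came from the library and lines that came from the caller alternate, so there is no longer one contiguous range per source file.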
Jake:Because it feels like we've got a similar problem
Jake:in sort of JavaScript standard stuff in that, like,
Jake:because of tree shaking and dead code elimination,
Jake:the size of, you know, some library's impact on your project
Surma:Right.
Jake:can be attributed or can be impacted by the other things that are calling it,
Jake:which, you know, you can say, like, this library is 300K,
Jake:but it's only 300K because it's being called by this, you know,
Jake:or it could even be, like, you know, an import star
Jake:and then an iteration over that,
Jake:or something, like, has actually caused the size of that to explode.
Surma:Yeah.
Jake:Is that similar, or am I kind of way off?
Surma:Somewhat.
Jake:Way off.
Surma:Somewhat.
Surma:Like, that's why I didn't find Bundlephobia super useful, because it tells you this npm
Jake:Hmm.
Surma:package is this big.
Surma:I was like, but, well, there's like three different versions in there, and hopefully
Surma:it's tree-shakable if I only use one function.
Surma:Like that's fine.
Surma:So that's why I think actually analyzing your bundle, your output, is a lot more useful
Surma:as a data point.
Surma:But obviously it's, you can't say in general this library is big or this library is small.
Surma:And you can't do that in native land either, because exactly that, things get removed or
Jake:Hmm.
Surma:reordered.
Surma:The problem here, just what I realized, and this is for example I think not something
Surma:that JavaScript bundlers or minifiers do, nested loop has been inlined, my callback
Surma:has been inlined, and now some of my code has been moved in between the loops, but not
Surma:all of it.
Surma:So what used to be a very clear call stack, I call a function, that function calls my
Surma:function back, now this code has been interleaved.
Jake:Hmm.
Surma:So there's not just a clear mapping of range to range, but now you have an interleaving.
Jake:Hmm.
Surma:And DWARF can express that as well.
Surma:And that's something that source maps I don't think would be able to, apart from the fact
Surma:that I don't think this kind of optimization happens very commonly in JavaScript land.
Surma:So yeah, that was just something I found very interesting.
Surma:And also, when you ask for, in this code that I've used, what is the originating source
Surma:file for this range, it doesn't just give you one file, it gives you a whole list of
Surma:files, because inlining may have happened, it actually does happen quite commonly.
Surma:So it will say, yeah, this address has four source files, because it was your main function,
Jake:Hmm.
Surma:but in this specific POC spot, it was originally calling a different function, has been inlined,
Surma:and that function, also called a function, has been inlined.
Surma:So you get this whole stack.
Surma:So it makes it a bit unclear how you attribute the file size, like I just did whatever addr2line
Jake:Oh, that's fun. Yeah.
Surma:decided to do, and it seems to be good enough.
Surma:But it's quite interesting to look at WASM files and see that sometimes code from a different
Surma:library is attributed to my code or the other way around, and it's just something you have
Jake:Yeah.
Surma:to deal with.
Surma:There is, I don't think there's a very clear, correct answer in every case.
Jake:I had a similar problem once when I was, like, trying to...
Jake:Because a lot of these bundle analyzer things,
Jake:like, you're dealing in source bytes, right?
Jake:And sometimes, you know, something can look really big
Jake:compared to another thing, but one of them gzips really well,
Jake:and that's what we really care about, because that's the bytes over the wire.
Jake:To some extent, you know, obviously gzip bombs are bad.
Jake:But, yeah, you get that similar attribution problem
Jake:because the order of the modules in a single file,
Jake:like, you know, once they've been bundled together,
Jake:that matters, because suddenly it's like,
Jake:oh, this first one is massive, and this second one is really small,
Surma:And then, you know, then there's stuff where the compiler has to inject stuff, it's just
Jake:but you put them the other way around, and the back reference is, you know...
Jake:It's kind of like, who owns the data being back referenced, right?
Jake:It's kind of... It depends on the order.
Surma:like in JavaScript land, where, for example, a pure annotation, that the function is pure
Surma:if you build a library, that has no source file, that's like a compiler-emitted part.
Surma:So those bytes are often without any mappings, and there's similar stuff in native land as
Surma:well.
Surma:So I was not really giving a good understanding of how DWARF works, but it is definitely a
Surma:lot more capable.
Surma:And looking back at it, I can see how DWARF would have been overkill for JavaScript, but
Surma:at the same time, I do wonder, with all the tooling that already exists, maybe it would
Surma:have been useful.
Surma:Although I have to say, I could not find a tool that I can give a DWARF file or a binary
Surma:with DWARF in it, that tells me which files have contributed how many bytes to this native
Surma:binary.
Surma:And so I had to build that myself, like this kind of, I guess, in native land, this kind
Surma:of bundle analysis isn't common, because only web is a streaming platform, right?
Surma:I guess that's the one thing that sets it apart from all other platforms.
Surma:And so that's, in the end, what I wrote.
Surma:It's called Wasmphobia, because even though I don't like Bundlephobia, I thought the
Jake:Nice
Surma:name was quite funny.
Surma:And so I published it.
Surma:I also added source map support, because that was, after I was done with it, actually comparatively
Surma:easy.
Surma:And the fun fact is that, you know, the WebAssembly binaries can have DWARF embedded, so you can
Jake:Oh
Surma:just drag and drop a DWARF binary into Wasmphobia, but DevTools doesn't support DWARF.
Surma:So when you step through a WebAssembly binary, it has its own format for the name of a function.
Surma:So you see sometimes the name of the function, but you don't know what the source file is
Surma:for this function.
Surma:And, you know, Rust and C++ mangle function names quite excessively to encode what the
Surma:module is that it came from and what the parameter types are.
Jake:Oh
Surma:Because in C++, you can have the function with the same name, but with different parameter
Surma:types for dynamic dispatch and stuff like that.
Surma:But luckily, at the time, Ingvar, who was on our Squoosh team and was a WebAssembly advocate
Surma:on the Chrome DevRel team, wrote a Chrome extension that teaches DevTools about DWARF.
Surma:So when you install that and you debug a WebAssembly binary, you now actually can step through
Surma:your original source code.
Jake:Oh, that's good.
Surma:And when you throw an exception or something, you get an actual stack with your source files
Jake:Oh, why is that not just in the browser?
Surma:and lines rather than just random WebAssembly addresses.
Jake:That seems essential, right?
Jake:But, yeah, very good.
Surma:I know, right?
Surma:I don't think variable inspection works.
Surma:You know, in JavaScript, when you step through, you can hover over variables and see what
Surma:the values are.
Surma:I don't think that works because it is kind of language specific.
Surma:Because like basically DWARF tells, yeah, this is a variable.
Surma:The original name is whatever.
Surma:And its type is Rust colon colon string.
Surma:So unless you know what the exact data structure of Rust colon colon string is, you won't know
Surma:how to visualize that.
Jake:Of course.
Surma:But just being able to step through and figuring out where an exception is being thrown or
Surma:something like that is already incredibly useful.
Surma:So yeah.
Surma:And I do not know why they kept it an extension and not just looped it into DevTools itself.
Surma:But that just deserves a massive shout out because I didn't realize it would actually
Surma:make my exceptions nicer.
Surma:And that's really valuable.
Jake:At the same time, I just did a sneaky bit of searching,
Jake:and, yeah, source maps is now in TC39.
Surma:Oh, cool.
Surma:We should link to that.
Jake:Maybe has been for a couple of years. Yes, we will.
Jake:It's stage zero, but they're definitely working on it.
Jake:Like, the last commit was three weeks ago,
Jake:so it's sort of being actively worked on, which is really nice.
Surma:That's more activity than some other stage zero proposals.
Surma:Yeah.
Surma:So this was my overview of what I wanted to call debugging on the web and how does DevTools
Jake:Absolutely.
Surma:know what code to show you and also why I guess we both have beef with source maps.
Surma:We both have beef with source maps and I really hope that in the standardization that
Jake:Oh
Surma:they learn from where source maps are falling short and then either make something that
Surma:is long lasting and doesn't have these problems, or maybe they should use DWARF.
Surma:I don't know.
Surma:I mean, source maps are already big.
Surma:DWARF is also big.
Surma:I don't think you win or lose a lot either way.
Surma:And using an existing format is usually much more beneficial than inventing something new.
Jake:Oh, absolutely.
Jake:It's had more hours of work put into it.
Surma:Yeah.
Surma:DWARF is extremely battle tested.
Surma:So I feel like it.
Surma:I don't know.
Surma:I guess we'll see what they do.
Jake:Excellent. Well, we'll link to your Wasmphobia thing,
Jake:which is, yeah, it's a lovely little project.
Jake:It is impressive, because my assumption was always, like,
Jake:well, once it's a binary, then your ability to really understand
Jake:where things in there came from,
Jake:I just assumed that that was impossible now.
Jake:But it's, yeah, and especially on the web,
Jake:where we do care about those bytes,
Surma:Yeah.
Jake:a lot of the image encoders and decoders we were dealing with
Jake:would, like, accept PNG as an input,
Surma:Yeah.
Jake:but we knew, like, we were just able to pass in raw bytes of an image
Jake:and sort of determining, like, have I definitely created a build
Jake:that doesn't have the PNG encoder and decoder in there?
Surma:Yeah.
Jake:And I was just like, I'm going to assume,
Jake:because I added these flags and the bundle is now smaller,
Jake:that I have succeeded.
Jake:But I don't know, it feels like it should have got more smaller
Jake:than it did, but I don't know.
Surma:You know what?
Surma:And now we could actually probably figure this out a bit more, because
Jake:Yeah.
Surma:Emscripten emits DWARF as well.
Jake:Oh, nice.
Surma:If you add the dash G full flag, I think I have it in the readme of Wasmphobia, and we
Jake:Ooh.
Surma:could drop it in and we could actually take a look at, like, is there a png.h header file
Surma:or png.c file that is still included somewhere.
Jake:Yeah.
Surma:I actually would be more confident that I can inspect what's in there and we could figure
Surma:out whether there's more that we need to do.
Surma:And I guess it's again, it's the same thing in native land.
Surma:This kind of tree shaking is just not a requirement, not something that is being thought of when
Jake:Yeah.
Surma:you write code, because even in Android land, right, like whether your Android app is eight
Surma:megabytes or 12 megabytes, nobody cares.
Surma:I mean, there is a limit at some point where you go like, wait, why is this app to take
Surma:notes?
Surma:120 megs.
Surma:But if you're below a certain threshold, nobody cares really about squeezing out the last
Surma:couple of bytes.
Surma:But on the web, it's very, very different.
Jake:And I think we were both impressed, like,
Jake:somewhere where it did happen differently,
Jake:we were both impressed by Animal Well,
Surma:Yes.
Jake:the kind of Metroidvania game, which I haven't played yet,
Jake:but I've watched videos of it and it looks really good
Jake:and I really want to play it.
Jake:But this is, like, when you get a game on the PlayStation,
Jake:it's, yeah, it's 100 gigabytes or whatever,
Jake:and if you dare install it from a CD,
Surma:Yes, easily.
Jake:because you think that, a CD, a Blu-ray, whatever,
Jake:because you think that's going to be quicker,
Jake:it's still going to be hit with that 50 gigabyte update
Jake:afterwards, you know?
Jake:But Animal Well is 34 megabytes,
Jake:which is, like, for a game these days,
Jake:that's smaller than Minesweeper is on Windows these days, so...
Surma:That's smaller than some web apps that are out there.
Jake:Oh, many, many.
Jake:Funny detail of it is, like, on the PS5, it's not 34 megabytes,
Jake:it's 101 megabytes.
Jake:But the theory is that, you know,
Jake:when you highlight the game's icon on the PlayStation,
Jake:you get, like, a background image shows up,
Surma:Yes.
Jake:which would be, like, I don't know, 4K or 8K or whatever,
Jake:and the theory is that's most of the size increase.
Jake:So that background image that appears on the PlayStation menu
Jake:is actually bigger, bigger than the game itself, yeah.
Jake:Which is, you know, that is the right way around.
Jake:Like, there are so many websites where a video of the website
Jake:ends up being smaller than the code to make it run.
Surma:It always makes me think of the demo scene in the mid-2000s.
Jake:Oh, yeah.
Surma:There was this game called .kkrieger, I think, which was 100K and had, for the time, stunning
Surma:visual 3D graphics.
Surma:And it was basically like a Doom first-person shooter.
Surma:And, you know, obviously the enemies were quite dumb, but it was incredible, it was
Surma:super impressive.
Surma:Actually, I'm going to link to that as well.
Jake:And things like Frontier on the Amiga,
Jake:where it's seemingly the universe fit on a floppy disk somehow.
Jake:There was one...
Jake:I mean, we talked about it on the show before,
Jake:but there was a few games on the Commodore 64
Jake:where they would give you a game to play
Jake:while the game was being loaded,
Jake:which I always thought was, like,
Jake:well, that's a lovely little bit of progressive loading,
Jake:but there was another reason that they did that,
Jake:and the reason was the hardware code on the Commodore 64
Jake:for loading data from the tape or from floppy disks
Jake:into memory was bad.
Jake:Like, they hit a bug close to shipping the machine out,
Jake:and it was like, this bug's so bad,
Surma:Wow.
Jake:we've got to do the whole loading thing
Jake:in a different and terrible way just for it to not be buggy.
Jake:And then so after it shipped, people found,
Jake:it's like, well, we could load in a loader that can do it better,
Jake:because better ways of doing it on software were found later on.
Jake:So while you were actually playing this little game,
Jake:it was actually also loading its own loader,
Jake:which could load the rest of the game into memory faster.
Jake:Yeah, which is brilliant, isn't it? I love that.
Surma:Some of these tricks at the time, I remember, I don't know which game it was, I'll look
Surma:it up, but there was a game which, again, shortly before shipping, was crashing when
Surma:you finished the game, like when you beat it.
Surma:And they couldn't figure out why it crashed.
Surma:So basically when you finish the game, it's just like, segfault, blah, something.
Jake:Oh, yes.
Surma:And so what they did, because at the time this was possible, when they loaded the game,
Surma:they would find that string in memory, like the string that contains the error message
Surma:for a segfault, and overwrite it with a message, thank you for playing whatever the game was
Jake:It was Wing Commander.
Jake:I'm sure we've talked about that on the show before.
Surma:called.
Jake:Yeah, thank you for playing. Wing Commander was the game.
Surma:And then they just ship it.
Jake:Yeah, yeah, I love that game as well.
Surma:Really?
Surma:Brilliant.
Jake:Nah, well, whatever.
Surma:Well, then we can cut this.
Jake:No, look, look, people have said,
Jake:like, one of the things they like about the show
Jake:is it's kind of like, it's almost like, you know,
Jake:they're just sort of down the pub with us.
Jake:And if there's one thing I do down the pub,
Jake:it's tell the same story over and over again.
Jake:Absolutely.
Jake:Oh, absolutely.
Jake:Well, with that, then, we shall say,
Jake:happy next time!
Surma:What a brilliant point to end this episode, I think.
Surma:We've told the same stories, so before we lose more credibility, we can call it.