River Room 3 - DDD Perth 2024 Livestream

emergency, like, stress out. I'll go over here for now, but can we turn this one on? Okay, he can't see me. We're going to roll with what we've got.

I don't know if you can get the sparkly stickers, but if you don't get them, I brought a bunch of sparkly stickers from the speaker room, so come grab them after if you'd like, if you're a sticker collector. Can we switch to this one, please? Thanks. Okay, cool. So I'm Simone. Can you hear me? Yeah, nodding in the crowd, thank you. I'm

a principal consultant from sunny Brisbane, which is apparently bucketing rain today, so thanks for the great weather and for having me in Perth. It's lovely. It's comparable to Brisbane otherwise, with lots of nice water and parks. I'm going to do that a lot of times. And I love really inefficient, insecure, non-cost-optimized modes of transport, which is ironic when you think about the talk, but you know, it's all about balance. I started my career managing Windows NT networks and Token Ring. My very first job was crawling around in the roof trying to

find the part of the Token Ring network that a rat had eaten, so the whole network could come back online. So if you want to know how old I am, that's how old, and if you don't know what that is, ask someone who looks old. But you know, it's the era of skincare, so hopefully you can't tell. I would love to get to know you all individually, but I really hate audience participation when I'm in the audience. But just so that I know where to focus this talk: if you're a developer who's doing infrastructure, either happily or begrudgingly, could you

just, like, give a sneaky wave? Okay, awesome. And then, if you're an infrastructure person who just loves infrastructure and does infrastructure all day, any of my peeps? Yeah. I always get really nervous during these talks because I'm an infrastructure person and I sneak into developer conferences, and I always expect people to just hiss at me and throw tomatoes and stuff, but sometimes I get a good result, so we'll see how we go. Just so that you have the opportunity to sneak out if it's not the right talk for you, this is what I

hope to solve with this talk: common issues like scalability, so you're having problems adjusting your resources to meet demand, or security vulnerabilities. If you work at one of these companies, I'm sorry, you had a bad year. Or maybe you have high operational costs. There are plenty of startups I've worked with in my time where the product's great but the cost is really high and they don't know how to get that down. Maybe you have reliability concerns, you're facing downtime. Oh, I just have to pause. I did

this branding workshop, I was out of my zone, and she said when you're on stage, make sure you make triangles and have gentle hands. And I was like, who's thinking about that? I'm just trying to do my... anyway, the photographer's here, so yeah. Every photo of me, I'm like... in the photos, you'll see. So you've got reliability concerns, or you've got monitoring gaps, so people are telling you your stuff's down and you don't know it's down until they tell you. Or you've got performance

shortfalls, so you're experiencing slow application response times and poor user experience. Or you're like many of the people who work in the cloud. In the olden days you had really definitive user guides. When I did VMware rollouts, there was an admin guide and it said: configure your CPU like this, configure your RAM like this, this is how your network looks, that's it. Do it like that or don't, and if you don't want to do it like that, you're on your own. But now it's like, here's 9 million blocks, do whatever, and it depends. And when you ask for advice, we go, "It depends," and you're

like, that's not super helpful. And so you end up in this cost, time, scope consulting triangle, where you either have dedicated people who are experts in all the things in every team; or you end up with developers who are burnt out and have tons of scope and don't actually get to write code or build their app, because they're too busy trying to figure out how the network works or how to fix DNS or whatever; or you end up with time, where you have a central team on the infrastructure side of things, but

there's a bottleneck there because they can only get through what they can get through, so maybe a cloud team or an SRE team or something. If you don't have any of those problems, this talk will be super boring for you, and I won't be offended if you sit at the back and play on your phone or walk out of here, because there are heaps of good talks on today. But I aim to help in this talk to find ways to reduce cognitive load and decision fatigue, to make sure your workloads are well architected. I hope to help minimize delays and

bottlenecks, so you don't have to have that central team doing it all, or have to wait all the time for help; maybe avoid duplicated headcount from having specialized people in every single team; and also try to avoid that rework, where you build it, then you give it to someone to check, and they come back and say, "Oh no, that's super insecure," and you have to start again, which is really infuriating and slow. And that's where we end up here, with the Well-Architected Framework. There's one for Azure, there's definitely one for AWS, and if you look at the doco

they're surprisingly the same, but I would say that the AWS one's a bit more concise and the Azure one's more verbose, which I think is just kind of their brand. But no matter what cloud you work with, they both have one, and they had the same pillars last time I looked, and they both have a Cloud Adoption Framework and a Well-Architected Framework. I just want to quickly set the baseline. Is everybody here familiar with the Cloud Adoption Framework and the Well-Architected Framework? Okay, I'll just explain them. When I first looked at

them, I was like, how are they different? So, quickly: you have the Cloud Adoption Framework, which is your journey to the cloud. If you've ever worked with a salesperson, they've definitely said that to you a lot of times. That's like: how are we going to adopt cloud, what's our security across the whole cloud, what's our cost across the whole cloud, what's going to be moved and what's not. So it's your end-to-end, entire cloud journey. And then you've got your Well-Architected Framework, which is about that one stack of workloads that you own: how do I keep that well architected? And that's what we're going to focus on

today. I read this thing that said if you're nervous before your talk, do some push-ups. Like, I can't do push-ups, but also, everyone's looking at you, so I'm just going to try and breathe. Anyway, these are the pillars for Well-Architected: reliability, security, operational excellence, cost optimization and performance efficiency. The only one I'm not going to spend a ton of time on is security, because there's DevSecOps and there are some talks you can go to that just get right into the weeds, and they're awesome, and I just think you can't really cover security in a, you

know, small amount of time. But the other pillars, I feel, are a bit neglected, so I'm going to spend a bit more time there. The first thing you need to do if you want to have well-architected workloads, or you don't know what you don't know, is set a baseline. When I worked with a lot of ISVs and startups at Microsoft, the first thing we did was a well-architected review or a go-live review. You can see on this slide there's a ton of them: there's a SaaS one, an Azure one, a data one, so you can pick your flavor. But if you

imagine, it's like having a consultant come in and fully audit your environment. It goes through your policies and your processes, and it gives you a bunch of recommendations that you can export as a CSV, or a PowerPoint if you want to present it to, say, an executive team. Then you can prioritize those, put them in your backlog and start working towards them, and you get a score and everything. It's really good and it's free. If you don't have an infrastructure person to come and fully audit your environment, or you don't have enough time, it's a really good place to start, especially the go

live one. If you're a SaaS provider or something and you're just about to go live, it gives you that really good baseline. But a common problem I used to see when I did this with small and medium businesses or startups is that they would consider it a one-time activity. I had a customer in New Zealand who had this great product. They had one pilot customer on there, but if they added another customer they couldn't afford to keep running, because their infrastructure was so expensive. So we did Well-Architected and focused on cost

optimization, but I doubt they ever went back and did this again, right? They just went, "Problem solved," and carried on from there. So they don't revisit and iterate on it, because it is quite resource intensive; it is a really in-depth audit. Or they find that they get 500 recommendations, and a lot of them are very low severity, but they're like, "That's just too many, it's too hard," so they don't know where to begin. Or it's a point of friction: they do it, and then it's like, what, we're meant to just go back and fix all this

stuff? And it becomes that rework, that tech debt, which is not well received. So the next best option I have for setting a baseline is Azure Advisor. It's enabled in your portal, it's free, it's on right now. The difference is it's based on 30 days' worth of data, or things you have deployed in your portal now. So it's not going to ask what policies and procedures you have in place, how you roll with this, or what your disaster recovery plan is, but it will look at everything you have deployed and give you real-time, updated recommendations.
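A long list of recommendations is easier to act on once it is triaged. The sketch below is a minimal illustration of that idea, assuming the recommendations have already been exported into plain dictionaries. The field names, the sample data and the `triage` helper are hypothetical and are not Advisor's API.

```python
from typing import Dict, List

# Hypothetical sample data: field names and values are assumptions for
# illustration, not the exact shape of an Azure Advisor export.
RECOMMENDATIONS: List[Dict[str, str]] = [
    {"category": "Cost", "impact": "High", "problem": "Right-size underutilised VM"},
    {"category": "Security", "impact": "High", "problem": "Enable MFA for owners"},
    {"category": "Performance", "impact": "Low", "problem": "Upgrade SDK version"},
    {"category": "Reliability", "impact": "High", "problem": "Enable soft delete"},
    {"category": "Cost", "impact": "Medium", "problem": "Buy reserved capacity"},
]


def triage(recs, pillars, impact="High", limit=5):
    """Keep recommendations at the given impact level in the chosen pillars,
    preserving their original order and capping the result at `limit` items."""
    wanted = {p.lower() for p in pillars}
    picked = [
        r for r in recs
        if r["impact"] == impact and r["category"].lower() in wanted
    ]
    return picked[:limit]


if __name__ == "__main__":
    for rec in triage(RECOMMENDATIONS, pillars=["Cost", "Security"]):
        print(f'{rec["category"]}: {rec["problem"]}')
```

The cap is deliberate: a short list that fits in a backlog is more likely to get worked on than the full export.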

You can set it to send you alerts, you can set it to go to your help desk system, whatever. So it is a really good, low-barrier-to-entry, easy-to-use way to keep your environment well architected. What I would suggest is, when you look at Advisor and you get your 9 million recommendations, look at the high ones: filter it by high severity, pick the pillars that are important to you, and maybe just pick five or ten things and work on them. This is what you'd see in the portal. When I used to work with

customers a lot, I would say you want to get about a 75 at a minimum, just like a Microsoft exam. But you can tell when you break down the pillars that this one, even though they've got 78, is probably not ideal, because security is woeful, and no matter what you're doing you probably want slightly better security. So it's good to look at the pillars and pick the high-severity ones, but I would always pick cost and security as the very top ones because they have a really big impact, and then, depending on your business requirements, reliability or performance

efficiency might be more important. But pick your cost and security and try and get them as good as you can. Look at the recommendations: some of them will have a suggested fix where you just click it and resize a VM or something, and others will tell you how to fix it. Put them in your backlog and work from there, and you can see the score improving, so you can go back to the person who pays for your work and say, "We improved these things, and here's a nice graph," and what have you. The next thing I'm going to do is quickly dig through the pillars. I've taken longer than I anticipated.
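To see why a single overall score can hide a weak pillar, here is a small worked sketch. The pillar scores are made-up numbers, and the overall score is computed as a plain average, which is an assumption for illustration and not necessarily how Advisor weights its score. The 75 threshold is the rule of thumb mentioned in the talk.

```python
# Hypothetical pillar scores (percent); invented for illustration only.
PILLAR_SCORES = {
    "cost": 95,
    "security": 40,
    "reliability": 85,
    "operational_excellence": 90,
    "performance": 80,
}

THRESHOLD = 75  # rule-of-thumb minimum from the talk


def overall(scores):
    """Plain average of the pillar scores (an assumption; the real score may be weighted)."""
    return sum(scores.values()) / len(scores)


def weak_pillars(scores, threshold=THRESHOLD):
    """Return the pillars scoring below the threshold, worst first."""
    return sorted((p for p, s in scores.items() if s < threshold), key=scores.get)


if __name__ == "__main__":
    print(f"overall: {overall(PILLAR_SCORES):.0f}")
    print("below threshold:", weak_pillars(PILLAR_SCORES))
```

With these numbers the average clears the threshold while security sits far below it, which is the situation described for the example tenant.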

We're going to quickly dig through the pillars without going too deep, looking at how I would try and help you stay well architected with the least amount of effort possible. The first one is performance efficiency. It's about scaling horizontally, testing early and often, and monitoring the health of your solution, so that's the infrastructure and all that kind of stuff. Before I go ahead, there's this thing on social media at the moment where you ask your partner how often they think about the Roman Empire, and apparently guys say, like, weekly. Do you think about the Roman Empire weekly? Can you please just give

me an acknowledgement? Yes, there's a lot. That's the answer I wanted for my joke to work. I don't think about the Roman Empire ever, which is so interesting, right? But do you know who I do think about all the time? Oh, sorry, the reliability pillar overlaps; they go together. But anyway, I think about Ticketek, because every time there's a concert or a show or a sports game, you can't get tickets, and the website crashes, and it's on the news, and people are having parties and they've got all their iPads lined up and blah blah blah, and how can we get the ticket? I'm just like,

what? You know people want Tay Tay tickets, right? Why wouldn't you just pre-scale it the day before? I don't understand. I was saying this to my team, and my colleague Ryan was saying, "Yeah, but it's part of the hype. They want it to be on the news, they want it to be hard," and it's all that [inaudible]. And I'm like, oh yeah, I guess that's fair in Ticketek's case. But most businesses want you to come and give them money, as many times as you want. Like, give me money. They don't want it to be hard, they don't want perceived scarcity. A good example of that

is Skims, which is Kim Kardashian's loungewear company. You can get cute little matching outfits for you and your dog for Valentine's Day. They had this drop recently, and they were hyping it up for months with all these celebrities. As soon as they dropped it, the website crashed and it was offline for eight hours, and social media was going berserk, which is why I followed it, because it was hilarious. But again I was like, what, you didn't think anyone was watching your ads? You're Kim Kardashian, give me a break. So this was the

experience their customers were having. They're like, "You know you're going to get traffic, and your website can't handle it." And I was like, I agree, Kylie, I agree. You're a billionaire and you can't afford a few scale sets? Come on, man. Kendall: classic Gen Z, just chaotic, with the bows, you know. And then Michaela. Someone do a mental health check on Michaela, because she was going to have a full-blown menty b if she didn't get to match with her dog for Valentine's Day. I feel like that's not the customer experience we're looking

for. Also, on menty b: one time I put a leave request in and I just wrote "menty b" with a smiley face, and my boss approved it anyway. I would suggest trying that. So, availability and resiliency don't equal reliability, right? After eight hours they got the website online, excellent. But as soon as they got the website online, the e-commerce provider or the database back end, I'm not sure which, obviously couldn't handle it, because everyone's orders were still timing out after 40 minutes, or their

orders went through and they got charged, but then they got an email the next day saying, "Oh no, we can't fulfill your order," or they got empty boxes in the mail with just a packing slip, or they got other people's orders. It was just a complete disaster. So even though the website was now available, I think we could all agree it was not reliable, which is why I think those two pillars kind of overlap for me, because you kind of can't have one without the other. And if you don't like the Kim Kardashian example because, you know, maybe it's not the right demographic, there's one with a car,

and that is: say your team is going to a concert together and you get a flat tire. Does your team know that you have a flat tire? Do they know that you have a jack in the boot? Do they know how to use a jack? Does the spare tire have air in it? Do they know how to change a tire? Blah blah blah. And after a while, do they know just to cut their losses and take a cab or a bus or whatever over to the concert? So when you think about disaster recovery and resiliency, have a backup plan, but then have a backup backup plan, and know how to execute one of those

plans. As for how you could do that in your environment, one of the best ways I could suggest is to use Azure Chaos Studio and Azure Load Testing. You can put synthetic load onto your environment, and you can use Azure Chaos Studio to just wreck your environment. It's pretty much like Chaos Monkey; you've probably all heard about that with Netflix, which is pretty cool. Pretty similar name, I don't know how they get away with that. But yep, Azure Chaos Studio and Azure Load Testing are a really good way to create synthetic chaos and load in your environment and make sure you're not the Skims of

2025. The next thing is holistic observability, which is part of that pillar. When I talk to people who don't do infrastructure, they're like, "Yep, there are more than 2,000 services in Azure. How do I know what to monitor and what do I care about?" The Cloud Adoption Framework patterns and practices team has a thing called Azure Monitor Baseline Alerts. It's free, it's open source, and every product group owner at Microsoft has to own their product, and they put the alerts that they think you should have for that service in there.
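Conceptually, a baseline like this is a catalogue that maps each resource type to the alerts its owners recommend. The sketch below models only that idea in plain Python. The resource type names, alert names and the `alerts_for` helper are invented for illustration and are not taken from the Azure Monitor Baseline Alerts repository.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, List


@dataclass(frozen=True)
class AlertRule:
    name: str
    severity: int  # lower number = more severe (assumed convention for this sketch)


# Hypothetical catalogue: one entry per resource type, curated by its owners.
BASELINE: Dict[str, List[AlertRule]] = {
    "storage_account": [
        AlertRule("availability_below_target", 1),
        AlertRule("soft_delete_disabled", 2),
    ],
    "virtual_machine": [
        AlertRule("cpu_above_85_percent", 2),
    ],
}


def alerts_for(resource_type: str, excluded: FrozenSet[str] = frozenset()) -> List[AlertRule]:
    """Return the baseline alerts for a resource type, minus any the team opted out of.

    Unknown resource types simply get no alerts in this sketch.
    """
    return [a for a in BASELINE.get(resource_type, []) if a.name not in excluded]


if __name__ == "__main__":
    for rule in alerts_for("storage_account"):
        print(rule.name, rule.severity)
```

Keeping the catalogue as data rather than logic is what lets a team remove or edit individual alerts without touching anything else.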

What it does is create a bunch of Azure Policy. Then, if I create a storage account, the policy will apply the alerts for things like soft delete being turned off or whatever, so it's just built in for you and ready to go. It's just a one-off deployment, you use PowerShell, it's very easy, and it's easy to delete the policies that you don't want or edit the alerts. It also sets up the action groups and everything for you, so check it out. And then also, you know how we talked about Advisor in the portal? If you want to go in deeper, there's the

Azure Advisor workbooks, and they go into more depth. This is a reliability one, and something you could see is that most people wouldn't know that you need to change the type of hard disk that you've got so that you get a better SLA. Even I don't think about that. But this will tell you: change the hard disk type and you get a 99.9% SLA for no effort. So the workbooks are really good. There's a cost optimization one, a service deprecation one, a bunch, and they keep adding them. It's not automated, but it's available for free and right

there. That stuff's about what's happening in the Azure infrastructure plane, but if you want to know what's happening within your workloads, no matter what platform they're on, Jake in my team loves OpenTelemetry. He will talk about it all day, so find him; he's the organizer here. You can build it into your things so that you can get the telemetry and the information you need out of your application, know what's happening, and keep it well architected. And then, conversely, a lot

of other people I work with in the dev space love Application Insights, and I don't mind this too, because when I have to do containers, which I'm not an expert at, it gives me all these pre-made workbooks with all the performance and everything else in there. You just turn it on and the workbooks are there with all the stats and data that you might need to keep your infrastructure running really well. Cool. Also, if you want any of these slides and you somehow don't get them, just find me on LinkedIn and I'll happily send them to you so you can get all

the links. Okay, cool. Next is operational excellence. That's again about that holistic observability, but really I think the key to this one is DevOps and reliable, predictable, automated deployments, or the current buzzwords, paved paths and platform engineering. We do the same thing and call it slightly different things, you know, round and round we go. But for me the key for this is building a well-architected stack, so keeping things well architected and

having sensible defaults in place, so that you don't have to think, every time you go to deploy storage, "What settings should I have?" It's just preconfigured for you without thinking about it. And then also keeping it well architected, so reducing tech debt: not just having that be a one-time thing that I think about when they change TLS one to two, but having some way to keep stuff up to date. I really like this quote from Kelly Shortridge's article, because it backs up the narrative of my talk, but it's also a really good article, and it says

that our common enemy is unintended behavior, and that we should build standardized patterns and standardized solutions for cross-cutting concerns, or those paved roads and patterns. The way I would do that if I were you, and the way I do do it, is to use infrastructure as code. I'm team HashiCorp Terraform, but you can use whatever you want, I don't care, but infrastructure as code is the way to go: Pulumi, ARM, Bicep, OpenTofu, that's if you want to be a rebel. And if you say to me, Terraform

isn't really a solution, it's just a language, that's fair. So I would suggest looking at the Azure Verified Modules. I don't know if anyone ever used the CARML repo? Oh, some nods. It was a bunch of preconfigured things that you could deploy, but no one really owned them, it was open source, and they went a bit out of date. These ones are owned by the Microsoft product groups. There'll be resources, like a storage account, or patterns, where it's a whole solution, and someone from the product group has approved all the settings, and it's well

architected and it fits together with all the other ones, so you can use that. There are also the development environments and CLI templates, but I don't love them as much, because you deploy them from your local machine, and I'm a big fan of change control and deployment gates and all that infrastructure sysadmin-y stuff. If you're building proofs of concept or just trying to see if something will work, the Azure development environments and CLI templates are little preconfigured solutions, like virtual desktop: you just click it and off it goes. But verified modules are a really good way to have those pre-made Lego blocks to build your environment in a

well-architected way. That's what they look like. You can see there's Bicep and Terraform. The resources are individual things, then you've got patterns, which are a whole entire solution, and the utility modules are tiny little modules that you reuse over and over again. These are the developer environments and CLI templates, which I'll skip through because, for some reason, I normally talk so fast but I'm burning through time. And then you say to me: okay, cool, but I've already built my environment, I ClickOps'd it in and it is

there, and I don't want to go back and write it as code. I've definitely had to do that: take a customer's environment and try and write Terraform to match it, and it was a mission, so I understand why you wouldn't want to do that. Also, if you see people who have ClickOps'd, I welcome you to say, "Welcome to ClickOps, baby," which I always say at work and no one ever laughs. Anyway, there are some tools I really like to use, and when I saw this one (Azure Terrafy was its first name) I was like, yes, finally. You point it, with PowerShell, at a resource group or a list of resource

groups, and it will take that stuff and write it as code for you, and it's about 80 to 90% of the way there. Perfect. So it's way better than having to just look at what's there and try and figure out how to write it, and test and test and test. If you're an ARM and Bicep type of person, that's cool, there's AzOps, which is also cool. Both of them are free and open source, and you can contribute to them, create issues on the repo, whatever. aztfexport has been picked up as a proper product, so you get faster replies on that repo, but use whatever works for

you. Also in preview (you know, if you go to the RC portal in Azure you get the preview stuff) there's a new preview for Terraform. You could always go into the portal and say export to Bicep or export to ARM, but I was super happy to see the preview now where you can go export to Terraform. This is a storage account. You can just click it and you can see the whole AzureRM provider and everything that's configured for it, which is amazing, because sometimes the doco is a bit lax and you don't really know what settings it wants or what

format it would like its values in. AzAPI, I think, is sort of where they're headed in the future over AzureRM, but it's really hard to find what on earth values it wants because the doco is really scarce, so the export to AzAPI is cool. Open your code, edit it, do whatever you need, and it's fully configured the way you had it in the portal. And then the final part of that piece is using DevOps. I'm a big fan of deployment gates. I'm a big

fan of teams being able to contribute wherever they need to and not having a bottleneck. You shouldn't have to ask the cloud team to fix a missing tag; you should just be able to create a PR, fix the missing tag and carry on with your day, right? The best way to do that is to have CI/CD: GitHub, GitLab, CircleCI, whatever you like. I'm team GitHub, but honestly it doesn't really matter as long as your whole team can use it and contribute. Cool. So that's how you can boost your journey to having a well-architected stack. But how do we keep it well architected? The best way to do that is to have guardrails

in place. This is sometimes a contentious topic, because a lot of developers don't want guardrails, they just want freedom, so this is hopefully a good way to find that middle ground. And also, this is why you do need guardrails. Stripe is, you know, those little square white things that you tap at the market or the coffee shop or whatever. In 2022 they processed nearly $817 billion worth of payments, which is a far cry from 2019, when they were new and they had a two-hour outage and they lost thousands and thousands of their small customer base,

who never returned, because if you can imagine you're at the markets or the coffee shop and you can't take payments for two hours, you're not stoked on that, right? It's not a good experience. So they committed to having operational excellence as their key pillar, and they wanted to build systems where the easiest path is the safest and most reliable one. They do that with guardrails: they make it hard to disable health checks, or skip checks when deploying, or override those sensible defaults, which is why guardrails are really important. And now the API is

99.99% reliable. Even during peak times, like a drop for Valentine's Day, they're good; they can take your money all day long. So you want to build guardrails into your platform, and the best way to do that is with the Enterprise Policy as Code repo. If you use the Cloud Adoption Framework templates for the landing zones, the Bicep one or the Terraform one, they use this repo as well. All the built-in policy in Azure and all the custom policy in Azure is there as code. You can either use that or you can copy

it and modify it, but it's a really good place to start with policy as code. Some dev teams I work with don't like policy, because they're like, "All this stuff just magically happens and I don't know what's going on and why that's changed." But if you have your policy as code, your team can see what's happening, they can see what the policy is, and if they don't like something they can make a PR here and be like, "I don't want my Azure Container Apps to get tags automatically because it breaks it," so they can exclude it or whatever. If you have it as code, it's documented and everybody can see it. It's not some magical thing that just sort of

drives people crazy. And at Arkahna, when I build the landing zones... oh, hopefully you can see that. Is that too tiny? Oh no, it's okay. It's too tiny? Okay, cool, just [inaudible]. So at Arkahna we do that with our landing zones. We have, for example, one to enable mandatory tags, because tags are super important but they're not something that you think about often when you're building environments. And so we have the API

versioning, which is cool, because, did you know, you can version policy now, which is really, really cool. No more automatic surprise updates. Then we apply it to our root or platform management group, and we apply the tags that way. I just walked through my demo, but yeah. You can see in the portal that it has a policy initiative here with 18 policies to apply tags, so for anything you add, if it doesn't have a tag, it will tag it for you from the subscription and inherit it down. So, how that looks, not just as

a block of code, which I often find not that useful if there's no context: we build management groups for our guardrails. We have that top level there. We build the platform management group with our connectivity, identity and management resources in a central place, governed by their own set of guardrails and policy and security. Then we have that overlying tagging, cost management, alerting and security that runs across the whole platform. And then we have subscriptions, so we use

subscription vending to stamp out subscriptions. If you want to build a new app, you get your own subscription. You're pretty free within there, but those guardrails apply from the top down for things like cost alerts and tagging and any other mandatory security stuff like that. So it comes together as this big thing. There's one on the Cloud Adoption Framework website that just looks bananas, but this is a bit more of a simple view. This is where we put our policy as code, so that goes across everything, but we apply it at the management group level, and we might

apply less restrictive policy to our sandboxes. Not no policy, though, because they're a great place to hack and build crypto mining, and a good way to get into your environment. So not no policy, but you might set it to audit as opposed to deny, so people can play around in there. Then we would auto-remediate stuff anywhere else that we would consider production or staging. So for all those things, if you don't have tags, it won't stop you deploying, but it will go ahead and rectify the things that we want to happen across the board. And then we use policy to create

alerts for the owners of subscriptions. I'm a big advocate for not just sending alerts to a centralized team. You'll probably find, if you look at your Azure setup now, that all the critical alerts go to the person who owns the tenant, which is usually a CIO or something, who doesn't check them. I used to get a lot of customers at Microsoft who had services deprecated and didn't know, even though they'd been getting emails for six months, because it goes to the person who pays the bills, and then next minute their services are turned off. So we try to make sure that the person who owns

the subscription, or the team, gets those alerts. We do that with policy as well: every time a subscription is created, more alerts get created for that team. So that's why we want to have guardrails in our platform, to make it hard to override sensible defaults and to apply them. And then we want to have guardrails in our pipelines, because that's a really good place to... I hate saying shift left, because everyone hates that now, but you know, it's a good place to put them. To put guardrails in your pipelines we use all the Terraform tools. I don't know if you use

Terraform, and no one's looking super enthusiastic, but anyway, we use terraform validate, terraform fmt and TFLint. They're all free and you can put them into your pipeline. terraform fmt formats your code nicely, so it doesn't matter what your preference is, it just formats it to the standard. TFLint does great stuff like checking for unused providers or deprecated things. It's really good. If you use Bicep or ARM, there's a tool called PSRule, written by Bernie White in Brisbane. It's really cool. It has every single pillar

of the Well-Architected Framework and all the recommendations built in. You can use it post-deploy, so if you're doing Terraform or ClickOps or something else you can use it for that, but if you have ARM or Bicep you can put it directly into your pipeline and it will tell you if things are not well architected straight off the bat, which is excellent. And then the Cloud Adoption Framework team, if you want to see what they do, have a repo here with all of the tools they use. You can see they use TFLint and tfsec and all the normal tools that we

use as well. I find that's a good place to see what new things they've come out with and just steal what they're doing. And finally, to test your code as you go, we've got terraform test. With terraform test you can assert that a website comes online when you build it, or that something has the right name. But I think there are a lot of things you might not want to use terraform test for: if you want it to be consistent across all your repos, use policy; if you just want to test things within that one repo, use terraform test or the language of your

choice's equivalent. Okay, security. We're going to be super quick on this one. As I said, there's a whole practice, DevSecOps. If you want some talks on it, feel free to contact me and I'll send you links to all the talks I love. I'm not going to attempt to dive into security, but I will share with you my motto in life, which is: don't be on the news. When I think about whether I should do this or that, I just think, will I be on the news? And if I will be on the news, then it's a hard no for me. And if you don't want to be on the news, don't be one of these 81% of developers that were surveyed that said

that they admit to shipping software vulnerabilities, because of business pressure, I guess. So just think about news versus business pressure; I always choose not being on the news. But no shade if you're one of those developers, because, you know, the project manager is on your case, everyone's on your case. It is hard. Also, a lot of developers I talk to do use these tools. They have security scanning in their pipeline already, but what I hear is that they just get a lot of things and they don't know if they're important and they don't know how to resolve them. And so

that's a real problem: they've got these great security tools, but it's just a lot of noise and it stops them doing what they want to do, which is build their thing. The only solution I have to suggest for that is Dependabot in GitHub. I don't know if anyone's played with it, but I really like it because instead of just saying there are all these problems, it tells you what the problem is and how to resolve it, and creates a PR that you can just merge in. And it can be enabled with one click across all

the repos you have in GitHub, so you don't even need to enable it per repo, which is cool. You can see here I've got one of Jake's repos. If you really like the thing that you've got, like you want Log4j in your repos and you just want it to be vulnerable, fine, you could just ignore that, but most people would just go ahead and merge those in. Cool, this is the last one. We nearly made it, let's see how we go. And cost. I did cost last because I think every

pillar has a cost portion to it, right? You can improve cost throughout, just like you can build security into everything. But really, cost is about getting value for money. And I think that for a lot of people, infrastructure devs, anyone who's building stuff, it's generally not what we think about until maybe a month after it's deployed, when you get someone wandering down asking, why is this $60,000? And you're like, it's resilient. And oftentimes when you go to a talk and they talk about a new feature and you ask how much

is that, they would say, I don't know, but it's cool. So I don't think we really think about the cost, because it's not really visible to us when we're building and architecting workloads. And so, an article by Duolingo... can I go over here? Yeah, that was weird. I like to wander, you know, or I'd be trapped. There's a good article by Duolingo where they talk about how they saved literally hundreds of thousands of dollars per month by just doing a cost

optimization exercise, which is really cool. They said the first step was just to understand every dollar they're spending: where it's sneaking off to, who's using it, and how it's changing over time. And the very first way to understand cost and where your money is going is with tagging. I'm sure we all know that. As I talked about earlier, there are more than 2,000 resources in Azure, and unfortunately you can only tag 625 of them directly. That's why that policy is really helpful: it will inherit the tags down from the subscription or resource group, so you have some chance of not having, you

know, 1,500 resources where you don't know where they belong, what they should be charged back to, or whether you could delete them. And that's turned on using the policy. If you want more guardrails, this is what I was talking about when we configure those alerts for our subscription owners. We configure budget alerts per subscription, so when you get to 60, 70, 80, 90, 100% of your budget it will send you alerts, or you can have it configured to go to your help desk system, however you roll: Slack, Teams. Have it

surface to people when they start to hit those budgets or get close. I've had teams accidentally leave Synapse running over the weekend, for example, and you come in on Monday and the customer's like, hey, why is our bill $20,000? No one knew, because you didn't get any alerts. You can configure those using policy, and they're just across the board. If you talk about cost and performance efficiency, there's a great talk by Sahil Batel. He talks faster than me, which I love; I don't have to watch his talks on two times speed, I'm just engaged the whole time. And he talks

about how they save tons of money, and that he finds when he works in IT we don't have a ton of regard for the performance per unit of the compute that we're buying. We don't take advantage of the hardware that we're using: we still optimize our apps and our workloads for spinning disks and older-fashioned CPU systems. But now, with things like OpenTelemetry, App Insights and cloud, you can improve your code or improve your hardware really easily. You're not constrained there anymore. He said that Honeycomb saw a

30% increase in their price-to-performance by switching over to the new CPU. Just a simple change like that, one click, and they save 30%. I was chatting to my boss Lee the other day and he was saying the same thing. I think it was App Services or something: they just moved to the next thing and got nearly double the performance for the exact same price. So don't discard that as an option. And on the performance efficiency pillar with cost optimization, this is further on the Duolingo one: they had a legacy service that was making 1.2 billion calls per day that it didn't need to. It just had all these extra

calls: when you called it, it just called everything. And then they had another service that was making constant requests for a resource that didn't change, so they just changed the time to live from one minute to an hour, and they reduced their traffic by more than 60%. If you look at this graph you can see where those billions of calls dropped off, and you pay for every call, right? So with just some telemetry and some tagging you can ask, whoa, do we need that many calls per day? So when we talk about operational excellence and cost, my favorite thing, my

favorite tool is Infracost. Hopefully you can see it, because it's quite bright in here. Infracost is really cool. It goes in your pipeline: you get a token, it's free for I think [inaudible] calls, which we never hit, and you just set up Infracost and tell it to do a diff, so what have I got deployed and what am I about to deploy, and then create a comment in your PR. That looks like this. You can see here I just turned on everything in our landing zone for the example, to be dramatic, and you can see that my new

cost, when I'm just about to push this change, is more than $1,000 more. You can go, well, I didn't mean for that to happen, and you can see that I'm deploying a firewall and a Bastion and a bunch of public IPs and a container registry, and everything else is turned on. And you can say, ah yeah, right, I didn't need two firewalls or whatever, and I didn't want my cost to change by $1,000 per month. It's free, you can put it in your pipelines now, and it can comment on a PR, so it's really helpful. We just do it for our internal dev one, obviously, not in our

customer deployments, but I would suggest checking that out. Also, if you want the cost to be more direct, you can get the VS Code plugin, so every time you type a resource (this is an AWS example) it pops the cost up at the top there: the total monthly cost for the instance is $700. So you can say, oh, actually maybe I don't need that one, maybe it could be smaller. Or maybe you do, but at least you can make an informed decision, and it puts it more in your face, not a month later when you have to go back and change

it. If you're using ARM or Bicep, there's a cost estimator which is free and open source. It's not as mature as Infracost, but it's free, it's there, and it works really well too. This next one is completely not automated, and I put it in every talk I ever give, because when I talk to customers about reservations they're like, oh, they're too hard, I'm trapped, I'm locked in. But really you can reserve everything that's on this screen; most people just think it's VMs that you can reserve. I had a customer recently save $23,000 a month by reserving storage, so it's worth having a look.

You go into the portal, you click this, and it will tell you what you can reserve and what your savings might be, based on what you have deployed. It's not automated, unfortunately, but you will get alerts for this from Advisor, so if you turn on the emails it will email you and say you could reserve this and this and this and save $7,000 or whatever. Then you can go into the portal and click around. You can trade in reserved instances, you can get refunds for them; they're not as scary and locked in as you might think, and you will save the bulk of your money there. So it's not

automated, I know it's a bit boring, but I'm just spreading the good word, you know. So when I'm looking to optimize costs, the way I do that is using Azure Advisor recommendations in an automated way, to push that stuff back out to my team without us having to go look for it. Then things like scheduled shutdowns: you can automate that and base it on tagging, like if it's dev it doesn't need to be running. We all know that one. But then some more purposeful things you have to go look for are decommissioning things, or right-sizing things, or looking at

those Azure Advisor alerts or your telemetry and seeing where you might be able to change your hardware, or reduce your time to live, you know, to one hour and save a billion dollars. Yep. So I rushed like a maniac. These are the pillars: reliability, security, cost optimization, operational excellence and performance efficiency. There's good doco on the Microsoft page about the tradeoffs. Say you want a really secure environment: that's not going to be the cheapest. And if you want a really cheap environment, it's not going to be super reliable or

resilient. So if you need help to understand which pillars are the most important to you, because you need to figure out which recommendations to prioritize, the tradeoffs are really good in helping you step through that. It's the most corporate, boring slide, but it's good info. And to summarize: you just need to start by setting a baseline. We use the Well-Architected Review as the really in-depth option, covering policies and processes and everything we have deployed, or you can use Advisor for a really quick and easy way to get started. You can use Advisor, the workbooks in Advisor, budgets

and alerts, and the Azure Monitor Baseline Alerts to start to understand your environment, see what you have and understand what's happening. Then you can implement guardrails, so I would suggest policy as code and things in your pipeline, so that you can keep your cost optimized and keep your stuff written as code. You can convert your ClickOps workloads back to code, so then everything's documented and managed with deployment gates and change control. And if you need help to get started, you can check out the Azure Verified Modules.
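[Editor's sketch] The tag-inheritance guardrail described earlier in the talk (a resource without a mandatory tag picks it up from its subscription) can be modelled in a few lines of plain Python. This is only an illustration of the rule, not Azure Policy syntax or any Azure API; the function name, the tag names and the sample values are all made up for the example.

```python
def inherit_missing_tags(resource_tags, subscription_tags, mandatory):
    """Return the resource's tags with missing mandatory tags filled in.

    Hypothetical model of an "inherit tag from the subscription if missing"
    rule: tags already set on the resource are never overwritten.
    """
    result = dict(resource_tags)
    for key in mandatory:
        if key not in result and key in subscription_tags:
            result[key] = subscription_tags[key]
    return result


def missing_mandatory_tags(tags, mandatory):
    """List the mandatory tags that are still absent (what an audit would flag)."""
    return [key for key in mandatory if key not in tags]


# Example values are assumptions chosen for illustration only.
subscription = {"costCentre": "1234", "owner": "platform-team"}
resource = {"owner": "app-team"}
fixed = inherit_missing_tags(resource, subscription, ["costCentre", "owner"])
```

Here `fixed` keeps the resource's own `owner` and only fills in `costCentre`, which matches the "it won't stop you deploying, but it will rectify it" behaviour described for remediation.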

And finally, you can use paved paths and repeatable infrastructure, with DevOps and infrastructure as code, to set up automated testing and linting. Make sure your stuff is streamlined, make sure it's load tested, make sure you have that chaos in there, or you understand how to recover it and make it resilient. Before we go on, I just want to say thank you to the sponsors. Amazing. Thank you for having me here from Brizzy. Also thanks to the room sponsor, SoftwareOne. I did a mental health check on Michaela; I think she got her matching outfit, because she looks really happy, so

don't be worried about that for the rest of the day. And that's me. Please find me on LinkedIn or Bluesky if you want the slides, if you have questions, or if you think I missed some excellent tools, or you have feedback on the talk. I love feedback, even if it's constructive. Not mean, just constructive. So hit me up. Thank you so much. Thank you, Simone, for a great talk. We're running out of time, so we quickly have to pick up one or two questions. Raise your hand and then the

mic will come close to you. You have to ask questions as... oh, this one. Oh no, triangles. I don't know, I didn't listen. Hey, this is not a super important technical question, but I feel like there's going to be a joke on the Roman Empire that we missed out on. The joke was just that I don't understand why you think about it every week. I still don't understand. But anyway, you're talking about well-paved templates, so there's a link there

between Roman roads... All right, I'll work on my jokes for the next time I give this talk. I love a good one, but I can't think of one off the top of my head, I'm not that witty. There's someone laying down with their hand up at the back. I bored them, but they've still got a question. Do you have any tools or approaches for testing policies? For testing policy? How do you mean, like just making sure it deploys properly and doesn't break stuff? Maybe the only thing I could

suggest is something like terraform test, where it goes and deploys all your infrastructure in a test way and then you use the asserts to say, make sure this happened, make sure that happened. Other times I've just done quick and dirty PowerShell scripts that run as part of my pipeline and check, does all this stuff belong where it is and is it how I want it? It's probably not super sexy, but that works as well. But yeah, terraform test is pretty extendable. Oh yeah, of course, that's the right answer: you can put it in audit and then

just manually go make sure it doesn't bork your environment. Yeah, the versioning really helps. Now that you have versioning for policy, you don't get those surprises where one minute it just doesn't work anymore and you spend a day on it. You can pin your policy to a version and then make an educated decision to go to the new version, which I find is really... Okay, last question, then we'll probably close from there. So it's Tim here. Congratulations on coming from infrastructure to do configuration as code. My experience is that that seems to

come more from the developer side, and with infrastructure it's quite tough to get them to convert. Is your experience, at the companies you're with, that there were challenges there but they're getting better, and everybody from that side is joining in? Because I have seen companies where that's happening, and I've seen other ones where it's still stuck. I see a lot of people who are the ClickOps team and they go, why would I do infrastructure as code? And I'm like, oh, I didn't realize we were still having this

discussion. Because, like, I just click a button. But honestly, for most people I know, as soon as PowerShell came along, as soon as I didn't have to click through that wizard nine million times in six million windows, that was about it. But I do find that sometimes if you've come from [inaudible] and you're doing physical hardware, blah blah blah, it's a little bit hard. I was fortunate to work with really great cross-functional teams. I had never used Terraform and in a few weeks I was up and running, because I had a team who was really supportive and happy to answer what seemed like my stupid questions. So I guess it's more just

your company culture. If you've got a culture where asking questions means you're not a 10x engineer, people probably aren't going to try anything new. If you've got a culture where the people who love what they do are happy to share that and coach other people, everyone will quickly get on board. That's my experience. Okay, right, thank you. Thank you. Please give a big round of applause for Simone, and thanks for coming, appreciate it. If you want to get in touch, please make sure that you take a

photo, and otherwise she's around, so you can find her and discuss offline. Thank you. Thank you so much. Sparkly sticker.

All right guys, come in quickly before we start our second run of sessions today.

Well, thank you guys for coming for this second round of the series. My

name is Michelle niku. I'm a research fellow at the University of Western Australia, and today I'm the MC of this show. I'll be introducing our second speaker, who is Aiden z g. He is a lead consultant at Cognizant Servian, with a diverse industry background ranging from startup-scale embedded IoT and Industry 4.0 through to enterprise-scale utex companies and

utility providers. His expertise lies in cloud-native secure software development, with a keen interest in exploring emerging technology. Please give a big round of applause for Aiden. Take it away. Cheers, everyone. Cheers, everyone. So yeah, my talk is called Cursed Things to Do with Lambda Functions. I typically give a lot of talks around what makes good practice, what's

good accepted practice from other areas that I've worked in, and so this was a really fun talk to put together because it's from the point of view of a villain: what are the worst things we can do with Lambda functions? What does Lambda technically allow us to do that may not be best practice? To start off with, I just wanted to thank all of the sponsors. I don't get to subject people to my cloud opinions often enough, but thanks to all of these sponsors, I do. And specifically to the room sponsor, SoftwareOne, really

appreciate it. And yeah, go and check out all the sponsors' booths as well. They're all over on the terrace, so make sure you check them out over lunch or any of the breaks. So who am I? I'm a lead consultant at Cognizant. Not Cognizant Servian anymore, we're just Cognizant. I'm a cloud-native software and security specialist, and from a consulting lens, I guess all you need to be a software consultant is to know the words: it depends,

that doesn't look right, and you probably don't need Kubernetes. I have lots of opinions, and hopefully some of them are right, or at least useful. I'm hoping the Venn diagram between this talk and the ones I have that are actually useful is close to a circle. I'm also a low-level programming language enthusiast. I come from an embedded IoT background, so my first love was always C. There's exciting stuff happening in that space, but I won't get too bogged down in that right now. So, the concept behind this talk: why

do we care about doing things badly in Lambda? We care because by testing the bounds of a system we build mechanical sympathy. It's a term originally from racing: you don't need to know how a car works to drive a car, but understanding how a car operates makes you a better driver. So that's what we're going to do today, but instead of driving race cars we are going to be driving Lambda

functions. This might sound like this talk is going to be really boring now, right? How far can we push Lambda with hello world? That was the concept I started with, and I wanted to see how far we could take it. We're going to use a very simple use case, but we're going to use it to do some fairly complex stuff. And AWS Lambda is less constrained than you might think. We normally think of Lambda as this service that we bundle our code up into: we've

got a very constrained runtime, it only has a couple of different languages that you can run on it, it's usually a certain version, all of that sort of stuff. But there's actually a lot more leeway than you might initially think. The first model we're going to look at for Lambda functions is this: AWS sends a request to our code, our code processes that request and returns a response to AWS. A very simple model. And to, oops, sorry, to

illustrate this we're just going to use a very simple hello world function, and this is probably about as simple as it's going to get today. We receive an event and then we return a JSON object that has our status code, 200, and a body with another JSON object with a message: hello, and whatever the name was that was passed into the event. So let's prove out that model. Which way is my AWS console? Here it is. Now, I didn't record my demos, so

fingers crossed they all work. Here we go, we've got our function in here. I might try and make that a little bit bigger. Same function we had before. I've got a test event defined, I can test that event, and it'll say hello test, because that's my name, and my model works, right? Lambda is operating as we expect Lambda to operate. But I haven't actually been entirely honest. That model is probably a little too simplistic, right?
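[Editor's sketch] The hello world function described here was not captured in the transcript. A minimal version of the same shape, written in Python purely for illustration (the language of the demo function is not shown in the transcript), might look like the following. The fallback to "world" when no name is supplied is an assumption, not something shown at this point in the demo.

```python
import json


def handler(event, context):
    """Return status 200 and a JSON body greeting the name in the event."""
    # Assumption: default to "world" when the event carries no name.
    name = event.get("name", "world")
    return {
        "statusCode": 200,
        "body": json.dumps({"message": f"hello {name}"}),
    }
```

Invoking it with a test event such as `{"name": "test"}` gives a body containing the message "hello test", which is the behaviour the demo shows in the console.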

That's not what we're here for. What we're going to look at is this. What actually goes on behind the scenes is Lambda instantiates a container. That container contains a runtime, which is something provided by AWS, and that runtime is what actually interfaces with our code. So on the Lambda side, all Lambda does is start a container and host an API, and that API contains an endpoint to get the next event and an endpoint to post a response back from

our Lambda. But we don't see that API at all: the Lambda runtime abstracts all of that away from us. Our code just receives an event and returns a response. So this is what's actually going on under the hood. And if we dive into that a little bit, it's just an API, right? Why do we need the runtime at all? Why can't we just interact with the API directly? That brings us to our first cursed Lambda function, which is lambda.sh. You've probably all

heard that you can actually write a Lambda function in bash. I'm not sure how many of you have actually gone and done this, but it is something you can do. So what we're going to do is instantiate a container, and our container is going to contain a bash script. There's no runtime; instead we're just going to call the next endpoint to get our next event, and then we're going to do some processing in bash and return a response. What that looks like is something like this, which is kind of

awful, because, you know, we're interacting with APIs in bash and it's never fun. Down the bottom we've got our loop that continually keeps polling the API for additional events, and when we do get an event, all we do is call our handler function up the top. We get our event data, and then I've actually got a Lambda layer in here which contains jq. Does everyone know what jq is? Yeah, it's just a JSON processing tool that you can use from bash, which is

really handy. All we do is check if our event has a property name, and if it does, we return the name, else we return world. So exactly the same thing we had before, but now in bash. I think the next slide is, yes, let's prove our model, exactly the same as we did for the original one. So we go to bash hello world. I'm unlikely to get as much nice stuff in here,

but there we go. If we test this, I've got a test event configured. I did test this before I came in as well. We can see I've added my name in, so our response is hello Aiden. I didn't return a JSON object, just because I was lazy, but we can do exactly the same thing and get a very similar sort of response from a bash-driven Lambda function. This is actually how custom runtimes work under the hood. If you've ever run a custom runtime for Lambda functions, all they're doing is

talking to the API, calling your code, and then returning the response back to that same API. This is really great for legacy code that can interact with HTTP but doesn't actually have an implementation on Lambda. You can spin up a container, put your legacy code in it, and write a thin wrapper around it to make it compatible with Lambda and interact with the broader AWS ecosystem. And that's all well and good.
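[Editor's sketch] The bash script from the slide was not captured in the transcript, so here is the same get-next-event, handle, post-response loop sketched in Python instead of bash. The endpoint paths, the `AWS_LAMBDA_RUNTIME_API` environment variable and the `Lambda-Runtime-Aws-Request-Id` header are from the Lambda Runtime API as I understand it; the function names and the plain-text response are illustrative choices, not the speaker's code.

```python
import json
import os
import urllib.request

API_VERSION = "2018-06-01"  # Runtime API version segment


def next_invocation_url(runtime_api):
    """URL that long-polls for the next event."""
    return f"http://{runtime_api}/{API_VERSION}/runtime/invocation/next"


def response_url(runtime_api, request_id):
    """URL that accepts the result for one specific invocation."""
    return f"http://{runtime_api}/{API_VERSION}/runtime/invocation/{request_id}/response"


def handle(event):
    """Return the name from the event if present, else greet the world."""
    return f"hello {event.get('name', 'world')}"


def run_forever():
    """Poll for events and post responses; only meaningful inside Lambda."""
    runtime_api = os.environ["AWS_LAMBDA_RUNTIME_API"]
    while True:
        # Blocks until Lambda has an event for this execution environment.
        with urllib.request.urlopen(next_invocation_url(runtime_api)) as resp:
            request_id = resp.headers["Lambda-Runtime-Aws-Request-Id"]
            event = json.loads(resp.read())
        body = handle(event).encode("utf-8")
        req = urllib.request.Request(
            response_url(runtime_api, request_id), data=body, method="POST"
        )
        urllib.request.urlopen(req).close()
```

An entry point inside the container would simply call `run_forever()`; it is deliberately not called here, since outside Lambda there is no API to poll.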

Right, but I haven't shown you how to do anything that really breaks the mold yet. All we're doing is stuff that you can do in bash, or stuff in custom runtimes. But what if you still need to use Node.js? You're not going to start a container that has Node.js in it, because one, that's probably going to be inefficient, and two, it's going to be a massive container. That brings us to curse two, which is breaking the mold with wrapper scripts. So I haven't actually been

entirely honest. We're going to refine our model a little bit more, and our new model looks like this. It's pretty much exactly the same: we instantiate our container, but now right up the top of our container we have our wrapper script, which is going to instantiate our runtime, and our runtime is what is going to call our code. This is for when NODE_OPTIONS isn't enough. There are certain things in AWS that you can put in

NODE_OPTIONS and certain things you can't. I do have a demo lined up for this, but in the space between me writing this talk and actually giving it, AWS has made it so the flag that I'm going to use as an example can actually be used in NODE_OPTIONS. Three weeks ago, when I wrote it, it couldn't. But that's the benefit of working in cloud, right? There are always updates and always new stuff to try out. So here is our wrapper script. All I want to do is expose the garbage

collector to my Node.js Lambda function. I've got the existing args, I take the extra args, which is just our --expose-gc, and then I insert the extra options into the call that's instantiating the runtime, and then we can exec the runtime. Of course, three weeks ago you weren't able to do this using the NODE_OPTIONS environment variable, but you were able to do it with a wrapper script. So AWS sort of doesn't let you, but you still can. Now they let everyone do

it. Then in our handler we can call those garbage collection functions, such as global.gc, and run the garbage collector on demand. So let's try that out. We've got our garbage-collected hello world function, and it's pretty much exactly the

same thing as our first model that we proved out, except now we have global.gc being called. We can test this, and hopefully we get a response back, and it's hello garbage collector. There we go. If we test this a couple of times we'll see that GC is actually run every time, so we're getting 125 milliseconds in terms of our execution time, which is a lot slower than we would on our base Lambda function. So if we go back to our original Lambda function and start that test

again, we'd expect this to execute much faster, because we're not running the garbage collector every time. Maybe every fifth one might be a little bit slower. Let me just run it again, just so we make sure we've got a fresh container. Oh no, it's about the same time. Okay, well, I'm not sure what's causing that, but normally it'll execute in a couple of milliseconds on the base Lambda. Cool, okay. So no more tail latencies due to garbage collection, right? If you're profiling

garbage collection this is very useful tooling, because now everyone gets garbage collection, so tail latencies completely disappear, because every latency is now a tail latency. So I can do stuff before the process starts, which is really handy, but it doesn't really say anything about doing anything in the process. That's all well and good, but let's keep that in our back pocket for now, because it's going to come into what we're going to be doing a little bit later

on. So then I want to bring us to curse three, which is all about logic in extensions. I haven't been entirely honest. What actually happens is we have AWS, which calls our Lambda API; the Lambda service instantiates our container; inside our container we've got our wrapper script that instantiates our runtime; our runtime runs our code. But we can also have this stuff over to the

side here, right? So we can also run stuff as a sidecar to our code, and this is kind of limited in what it can do, because it's got this limited API that it can talk to. It doesn't talk directly to the events API, sorry, the invocation API; it talks to the events API. So you can get stuff about when events were invoked and all of that, but you can't actually get the event itself.
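To make that limitation concrete, here is a minimal Python sketch of what a sidecar extension gets to see. It assumes the "events API" mentioned here corresponds to Lambda's Extensions API (register once, then poll for the next lifecycle event). The helper names, the host value and the extension name are hypothetical, and the requests are only built, not sent.

```python
import json
import urllib.request

# Versioned path prefix of the Lambda Extensions API.
EXTENSIONS_API = "2020-01-01/extension"


def build_register_request(runtime_api: str, extension_name: str) -> urllib.request.Request:
    """Build the registration call an extension makes once at start-up."""
    body = json.dumps({"events": ["INVOKE", "SHUTDOWN"]}).encode("utf-8")
    return urllib.request.Request(
        f"http://{runtime_api}/{EXTENSIONS_API}/register",
        data=body,
        headers={"Lambda-Extension-Name": extension_name},
        method="POST",
    )


def build_next_event_request(runtime_api: str, extension_id: str) -> urllib.request.Request:
    """Build the long-poll call that blocks until the next lifecycle event."""
    return urllib.request.Request(
        f"http://{runtime_api}/{EXTENSIONS_API}/event/next",
        headers={"Lambda-Extension-Identifier": extension_id},
        method="GET",
    )


def describe_event(event: dict) -> str:
    """Summarise a lifecycle event; there is no request payload to read."""
    if event.get("eventType") == "INVOKE":
        return f"invoke {event.get('requestId')} (payload not available here)"
    return f"{event.get('eventType', 'UNKNOWN')} event"
```

An INVOKE event carries metadata such as the request ID and a deadline, which is why the sidecar can tell that an invocation happened without ever seeing the event body.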

So what we can do is, well, why limit our logic to our deployed function, right, when AWS gives us all of these new and exciting, interesting places that we can possibly put it? So here we go, we've got a hello world function that is written in Rust. All we do is, from the body we get the name; if we have a name, we format a string that says hello whoever; if not, we return hello world; and then we

return a response with status OK and our JSON response. That's right, it was actually a Rust talk all along, right? I left that part out at the start, but low-level languages have always been a passion of mine, so it has been a Rust talk all along. So if we put our logic in Rust, we can then call that logic from our actual

Lambda function, and what that looks like is something like this, right? What we've done is we've created a Rust extension. That Rust extension starts up when our Lambda starts, and what we've configured it to do is instantiate a local web server, just local to the Lambda function. And so in our actual Lambda function, the actual deployed code that AWS is running, what we're going

to do is talk to that local web server on the path hello world, send it a POST with our body, which is our name, our event.name property, and then all we're going to do is proxy through the response that we get from our web server as a text response. So let's prove that model now. There we go, extension hello world. So here we go, exactly the same

Lambda function, except there's no hello world logic in this Lambda function at all; it all exists in the deployed web server on the Lambda function. So I've got one called hello Aiden here, I assume that's got my name in it. There we go, hello Aiden. As you can see, the duration isn't great, it takes a little while. I didn't mean to deploy that. Let's run it again, hopefully it runs a little bit faster. There we go, 168. So we've put our logic in an extension, but, you know, it's still not very fast.
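The delegation pattern described above can be sketched as follows. This is a stand-in, not the code from the demo: the talk's extension is written in Rust and the function in Node.js, whereas this sketch uses Python for both halves, picks an ephemeral port, assumes the path is /hello-world, and passes the port to the handler explicitly. All names are hypothetical.

```python
import json
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

# Opener that ignores any proxy settings, so localhost calls stay local.
_opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))


def greeting(body: dict) -> str:
    """The hello-world logic that lives in the extension, not in the function."""
    name = body.get("name")
    return f"hello {name}" if name else "hello world"


class HelloHandler(BaseHTTPRequestHandler):
    """Stand-in for the extension's local web server."""

    def do_POST(self):
        if self.path != "/hello-world":
            self.send_error(404)
            return
        length = int(self.headers.get("Content-Length", 0))
        body = json.loads(self.rfile.read(length) or b"{}")
        payload = greeting(body).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)

    def log_message(self, *args):
        pass  # keep the demo quiet


def start_extension_server() -> HTTPServer:
    """Start the server on an ephemeral localhost port in a background thread."""
    server = HTTPServer(("127.0.0.1", 0), HelloHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def handler(event: dict, port: int) -> str:
    """The deployed function: no hello-world logic, it only proxies the reply."""
    request = urllib.request.Request(
        f"http://127.0.0.1:{port}/hello-world",
        data=json.dumps({"name": event.get("name")}).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with _opener.open(request) as response:
        return response.read().decode("utf-8")
```

Every invocation now pays for a local HTTP round trip on top of the logic itself, which is one plausible reason the demo's duration is worse than calling the logic directly.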

It's still not as performant as it is running our logic directly, right? So even though there's new and exciting places to put our logic, that doesn't mean all of our logic should go there. Now I want to raise an interesting point, right? What happens if we take everything we've just talked about and wrap it into one really big cursed Lambda function? What is the most cursed thing we can do with

Lambda? And so that brings us to curse four, which is dynamically modifying invocations on any runtime or architecture, which is a bit of a mouthful, but it's good fun. So how did I end up here? I originally gave a talk at HashiTalks, and it was basically about this, right? Normally when you run a Lambda function, you've got your Lambda code, you might have Amazon API Gateway further upstream, you might

have, you know, a database further downstream, and then in your Lambda layer you might have all of your libraries and all that sort of stuff, which is really cool. But what I really wanted to do was have a way to add security by default to these Lambda functions. And so rather than getting a developer to, at the top of their Lambda function, say, you know, do authentication or something like that, I wanted something that was a little more resilient. I wanted to make it so the developer didn't need to remember it. Instead, if they didn't want security, I

wanted them to have to think about actively removing that security. And so I sort of built this solution based on some stuff I'd seen New Relic do about instrumenting Lambda functions. Basically, when the Lambda function was deployed, we did this: we actually put our Lambda function handler inside our Lambda layer, and then our Lambda layer called our Lambda code. That way we could dynamically wrap anything that was built in our Lambda code with functionality from our

Lambda layer. And so, yeah, that was really cool and really interesting, but it was sort of limited, right? It was runtime dependent: you had to have the correct runtime and a Lambda layer that worked with that particular runtime. It was fragile: every time something changed, or a library was updated, or a Node.js version changed, you might have to rewrite your layers and then redeploy them all, and of course it becomes this whole thing which is difficult to manage. It was also kind of

hacky, because, you know, who puts their handler in their Lambda layer? Not the best place to put it, but it worked. And you could also technically inject modules via env vars, which made the functionality work to inject our handler, but is also sort of worrying from the point of view of being able to inject other things. All right, so what's the alternative in this scenario, if we want to do a similar sort of thing but don't want to

expose ourselves to all of these sorts of risks, right? And so I realized that, from this slide, there was a bunch of stuff here that I didn't like, but I couldn't get rid of all of it. I really had to work out which of these I could put up with and which ones I wanted to get rid of. So the alternative was doubling down on "kind of hacky", right? This is where it sort of gets a little bit dicey, a little bit weird, but it still

works. Well, I haven't been entirely honest, right? Remember how I said that our extension could only talk to the events API and get events about invocations and stuff like that, but it couldn't access the actual invocations themselves? What if we did this: what if instead we put our extension between our runtime and Lambda, and then in our wrapper function we tell our runtime that AWS actually exists inside our

extension, and then we proxy all of our code through our extension to the AWS Lambda runtime API? And then "any architecture" is kind of cool as well, because we're looking at running extensions as binaries, right? These are Rust binaries that we've created, and the thing about compiling these things is, when you choose a Lambda runtime, Lambda has compiled that

runtime for that particular architecture, right? But we don't have that luxury, so if you're using arm64 or x86_64 then you need a different binary for each one. But this is one of the cool things that I found in the AWS documentation: you can actually do something like this, where all you do is find out what architecture you're running on, and you compile two binaries, one for arm64 and one for x86_64, and then you

launch the version of your extension that corresponds with the current architecture. So that's really cool, and of course we've got "unsupported architecture" down there just in case someone does something weird like RISC-V. But yeah, so we can run these sort of sidecars, right? And "any runtime" is the other question. So far I've shown you things where either we've sort of wrapped Node.js, or we've put an extension in, or it's been dependent on

what's been running. But can we decouple ourselves from being runtime dependent in this scenario? And so, because the Rust SDK for AWS is GA, we can write a whole bunch of stuff that interacts with AWS services, and of course we can do basically anything we want in AWS with the Rust SDK. So this is the first part of the extension, right? What we've got is a route, and all we're doing is,

any time we get a request that looks like it's meant to go to the runtime API, we intercept it and send it to our own get invocation next function. And in this get invocation next function (I haven't put the whole function, just for brevity's sake) all we have is, we get the name property from our body, and if we have a name then we return the name, JSON, or, yeah, we JSON

format "whatever it is and rust", and then if we don't have anything then we just send an empty response through, and away we go, right? So if we have a name property in our body we'll change it; if we don't have a name property, the body just gets proxied through as it is. And then this is our wrapper script, right? If you try and add AWS_LAMBDA_RUNTIME_API as an environment variable in the actual

Lambda console, then Lambda will say no, no, no, don't do that, that's a bad idea. And I agree, it is a bad idea, but we're going to do it anyway. So what we do in this wrapper script is just set the environment variable ourselves, AWS_LAMBDA_RUNTIME_API, to our local Lambda extension, and then we exec the runtime as normal. So we now have arbitrary control over the interaction. And so going back to our

first example, right, our first example was just a very simple hello world. We have a handler where we just return a response with our body, hello event.name. And so let's prove out our model, right? Let's see if it all works when we put it all together. Proxy hello world JS. So just to show there's no funny business going on, this is our Lambda function here: hello

event.name. If I go to the test page, the name is Aiden, so we're expecting "hello Aiden" from just the code that's deployed, not including the extension. And if I test this, fingers crossed... "hello Aiden and rust". So we've intercepted the call as it went between AWS and our actual runtime. And the cool thing about this is, if you look at that duration, that is actually significantly better performance than pretty much

anything we've seen so far, right? So we're actually not incurring a huge performance overhead doing all of this sort of stuff. So it unlocks really interesting possibilities that we wouldn't have had otherwise. Let me go back to my slides. Yeah, I did say any runtime, right? And so far I've only shown you Node.js, so, who knows, I could have written something in Node.js and we don't have any way of knowing unless I actually wrote it in Python as well. So that's what I've done: we have a handler that is in Python

and just returns exactly the same thing. We've got our status code and our JSON body with hello event name, so we can send exactly the same event through to Python. Now the other thing I want to call out here is, if I go to configuration (it's been a while since I've actually had to use the terminal)... there we go, architecture for this one is arm64, right? So I've been running Node.js on arm64, just because if

we're going to do a different runtime, why just change the runtime when we can change the architecture as well, right? So let's go Python, x86_64. Jumping into proxy hello world py, we've got exactly the same thing. Once again, no cheating going on. If I open up our test function, the name here is test, and if I test this we'll get, hopefully,

execution succeeded: "hello test and rust". So this is really cool, right, because it's not limited to just that particular runtime, and it's very cross-compatible with anything you want to deploy on Lambda. So we've done it in Python, we've done it in Node, we've done it on Arm, we've done it on x86_64, and this is all one Lambda layer as well, by the way. It's not like I'm switching out Lambda layers halfway in between. So if I jump

into code, you can see we've got our layer down here. I've only registered it as compatible with Node.js, but it does work with Python, as we just saw, because this is the Python function. It's compatible with both x86 and arm64, and yes, it did take me 56 tries to get right, please don't laugh at me. So yeah, it's a really interesting sort of concept, right? That's the slide I was meant to have up before I showed you the Python

version. Cool, so we've managed to do this, right, with dynamically modified invocations on any runtime or architecture. So the first three curses were maybe a little bit of a red herring; they were setting the groundwork for this final pattern. And so there's some really interesting implications of this, right? We're now not limited to stuff we can do in managed services and stuff like that; we can put anything we want in between the Lambda API and our function. So if you want to do defense in depth, if you want to make sure that

you're validating a JWT, or if you've got something that you want to check for SQL injection or anything like that, and you want to do it at the Lambda function layer rather than at a WAF layer or something else. Of course there's good reasons why you should do it at the WAF layer and at every layer, but if you want to check that at every layer, you can do defense in depth through this sort of pattern, right? And any of those sorts of strategies you can write in Rust and deploy to any architecture and any runtime. Tracing and monitoring is another cool one, right? A lot of instrumentation for tracing and

monitoring comes through either as libraries that you need to import and stuff that you need to add in yourself, or you need to do specific types of logging and stuff like that, and then the extension on the Lambda function talks to the Lambda Logs API and then exports those out to your third-party tracing, logging and monitoring provider. But because we can intercept the events themselves directly, we can actually do all of that without integrating any libraries, and completely transparently to the

developer, right? And then of course there is one other really good thing that this pattern is useful for, which is request enrichment. So if you've got, you know, a user string or something like that in your headers, or you've got any sort of information that you want to enrich or add additional context to, then you can do that and provide that through to your actual Lambda function. And the last thing that it's good for, and my primary purpose, is making fun

PowerPoints, right? So yeah, it's a fun topic to play around with, and I wouldn't recommend using it necessarily in production, but it's a pattern that's useful to know exists, right? Our idea wasn't to give you necessarily tools to take away and go use tomorrow; our idea was that we wanted to test the bounds of a system, and we wanted to use that test to build mechanical sympathy for the underlying systems that we use every day. So yeah, hopefully I've

achieved that. I did need to go a little bit faster, but I think I went a little bit too fast now, so we'll have plenty of time for questions. But yeah, if you've enjoyed the talk, this is my LinkedIn, feel free to connect, and I'll open up the floor to questions. If anyone wants to ask any questions about the code or about my sanity, feel free to ask away.

Thank you, Aiden. It's time to take questions from the floor. Just raise your hand and the mic will come to

you.

Is there an advantage doing it this way for event enrichment, compared to putting another Lambda in front of it?

Yeah, so it really depends on the way your Lambdas are set up, right? If you're doing request enrichment through something like API Gateway, and your API Gateway is awaiting a response, then whatever Lambda is doing the enrichment is going to be running for the same time

that it takes for the Lambda that does the processing to actually run, right? So by chaining two Lambda functions together, you're paying for twice the compute even though only one piece of compute is running at a time. So there is an advantage there. You could do it technically in... it depends what sort of request it is. If it's a Cognito request, you could do it in a pre token generation hook or something like that. But for API Gateway stuff, it's kind of handy to be able to do it all in one, because you're not doubling up your compute by chaining Lambdas together. So it's a cool way, but yeah, you could solve it in different ways as

well. Same scenario.

Any other questions? No? Well, if we don't have questions, let's give a big applause for Aiden. [Applause] It's almost lunchtime, so if you don't know your way to get some food, just turn right and continue walking, and

you can look on your tag: there it says FZ, which stands for Food Zone, and then the number is your zone where you pick up your food. Thank you. [Music]

Not sure if you've heard, but at Bankwest we're on a mission to become Australia's favorite digital bank and to make banking seriously uncomplicated. Imagine digital banking so simple it makes sense to everyone, no matter who they are or where they are. That means me. We know that uncomplicated doesn't just happen, it's made. Not by a company, or one individual, or even a bot. It's made through our curiosity, honesty and desire for change. In other words, it's made by us. We know becoming the favorite isn't easy, but when you're making a difference for customers, colleagues and the community, when you feel like you're learning and growing and your ambition is matched, well, then it's all worth it. The truth is Australians deserve easy, useful banking experiences. This might not sound like a big thing, but for us it's everything. It would take all of us working together to make something we're all proud of, and that's why we need brilliant people like you. Bankwest, made by

us. MakerX is an award-winning research and development studio building digital products and high-growth ventures at the frontier of technology. Our team of 40 experts, including 10 former CxOs, brings a wealth of industry insight and advanced expertise, including AI, Web3 and venture building. Combining an extensive library of robust technology building blocks, an entrepreneurial culture and a proprietary R&D framework, we create groundbreaking products, open new markets for corporations and optimize existing businesses. For those brave enough to do what's never been done, get in touch to discuss how MakerX can drive your business forward. For inventors passionate about fearless exploration and bold experimentation, MakerX offers a vibrant, inclusive environment where innovation and continuous learning are core principles. Here at MakerX you'll work on transformative projects ranging from virtual machine compilers to nonprofit mental health apps. Join us to help invent the future and make a meaningful impact in the digital world. MakerX, made for tomorrow.

Hi, I'm Melissa Herrera from DataStax, a company focused on helping developers turn their dreams into wild new realities through generative AI. We are here to make it 100 times easier and faster for developers to build gen and RAG, or retrieval-augmented generation, apps and experiences that you can quickly get into production. DataStax offers a one-stop gen stack and data API that gives developers all the data to build a complete RAG app, for a faster and easier path to production. We've also added Langflow, making it easy for developers to build one-click RAG at scale with an open-source visual RAG framework, with drag-and-drop data flows and pre-built components for connecting to any API, data source or database, one-click deployment, and seamless integration with Astra DB for more relevant answers with ultra-low latency, giving you 20% higher relevance, nine times more throughput and 74 times faster response time at 80% better TCO. DataStax delivers a RAG-first developer experience with first-in-class integrations with leading AI ecosystem partners, so we just work with developers' existing stacks of choice.

Why is Mantel Group such a great place to be? Gee, that's hard. Where do you start? Immediately what comes to mind is the people. Might sound like a cliche thing to say, but genuinely I believe that we've got the best people, not only technically and from a consulting perspective, but just generally as people.

The Mantel Group principle that resonates most with me? I smile 'cause there's quite a few. Look, they actually all resonate with me, but I would say "in it together". "Love what you do and be awesome at it": in my mind, people are at their most productive and engaged, and enjoy their work the most, when they're doing something that they're passionate about. The Mantel Group principle that resonates most for me is "make good choices". It's the fundamental one that sits almost underneath the other principles and allows us to use those principles to make our decisions and how we work. "Make good choices": I love this one because it's around treating people as individuals and trusting them to make a choice that's right for them. We're close to a thousand people now, and we still are able to be principle-based, where our team genuinely make good decisions and use their principles to guide themselves. Mantel Group principles enable me to be more me at work. We're breaking the mold in ensuring that we create an experience for people that is unique to them and really redefines what work means for people.

[Music] The world is changing fast. In the new world of business, the fast and the agile will thrive, while the slow and the inflexible will be left behind. To compete, you know that digital transformation is a must. You must leverage the cloud. You must turn to methodologies like DevOps. You must innovate software at blazing-fast speed. It's a lot of musts. Thankfully, there's Tricentis. We accelerate digital transformation with the most comprehensive continuous testing platform. From best-in-class test automation and test data management to industry-leading performance testing, system simulation and risk-based testing with the power of AI, you can rapidly design, automate and orchestrate virtually any type of test for your business to accelerate software delivery. Built for enterprise scale, it increases speed, reduces costs and decreases risk, so you can release software faster, fund new growth initiatives and be even more confident in your success. And when that happens, you bring new products to market faster, speed innovation and create more fiercely loyal customers than you ever imagined. It's no wonder that companies like McKesson, Dell and Mercedes-Benz turn to Tricentis to accelerate their digital transformation. Tricentis: speed changes everything. [Music]

It's always about the people. Yeah, it's always about the people you work with. [Music]

I see the company's commitment to its employees, the commitment to the community, and that... There's a ton of opportunity for support and empowerment here. It makes me really proud to say that I work for Woodside. I can see the contributions that they make to our community, right from the Nippers all the way through to the community associations. [Music]

You have that accessibility to leadership, and I feel as if my view and my contributions are valid. [Music] It's the fact that it's a very reliable company, and our customers value that reliability, they value energy security, they need it. There's so many conveniences and comforts that we enjoy today that we expect to be there tomorrow. Whether it's preparing a meal or charging your mobile phone, Woodside's ability to provide reliable forms of energy really allows us to help people access a better quality of life. [Music] We play quite an important part in everyone's life, and that's quite special, and I'm quite proud I get the chance to work for, you know, a gas plant like this. [Music]

Not sure if you've heard, but at Bankwest we're on a mission to become Australia's favorite digital bank and to make banking seriously uncomplicated. Imagine digital banking so simple it makes sense to everyone, no matter who they are or where they are. That means me. We know that uncomplicated doesn't just happen. It's

made. Not by a company or one individual or even a bot. It's made through our curiosity, honesty, and desire for change. In other words, it's made by us. We know becoming the favorite isn't easy, but when you're making a difference for customers, colleagues, and the community, when you feel like you're learning and growing and your ambition is matched, well, then it's all worth it. The truth is Australians deserve easy,

useful banking experiences. This mightn't sound like a big thing, but for us it's everything. It will take all of us working together to make something we're all proud of, and that's why we need brilliant people like you. Bankwest: made by

us. MakerX is an award-winning research and development studio building digital products and high-growth ventures at the frontier of technology. Our team of 40 experts, including 10 former CxOs, brings a wealth of industry insight and advanced expertise, including AI, Web3, and venture building. Combining an extensive library

of robust technology building blocks, an entrepreneurial culture, and a proprietary R&D framework, we create groundbreaking products, open new markets for corporations, and optimize existing businesses. For those brave enough to do what's never been done, get in touch to discuss how MakerX can drive your business forward. For inventors passionate about fearless exploration and bold experimentation, MakerX offers a vibrant, inclusive environment where

innovation and continuous learning are core principles. Here at MakerX you'll work on transformative projects ranging from virtual machine compilers to nonprofit mental health apps. Join us to help invent the future and make a meaningful impact in the digital world. MakerX: made for tomorrow. Hi, I'm Melissa Herrera from DataStax, a company focused on helping developers turn their dreams into wild new realities through generative AI. We

are here to make it 100 times easier and faster for developers to build gen AI and RAG, or retrieval-augmented generation, apps and experiences that you can quickly get into production. DataStax offers a one-stop gen AI stack and Data API that gives developers all the data to build a complete RAG app for a faster and easier path to production. We've also added Langflow, making it easy for developers to build one-click RAG at scale with an open-source visual RAG

framework, with drag-and-drop data flows and pre-built components for connecting to any API, data source, or database, one-click deployment, and seamless integration with Astra DB for more relevant answers with ultra-low latency, giving you 20% higher relevance, nine times more throughput, and 74 times faster response time at 80% better TCO. DataStax delivers a RAG-first developer experience with first-in-class integrations with leading AI ecosystem

partners, so we just work with developers' existing stacks of choice.

Why is Mantel Group such a great place to be? Gee, that's hard. Where do you start? Immediately what comes to mind is the people. It might sound like a cliché thing to say, but genuinely I believe that we've got the best people, not only

technically and from a consulting perspective, but just generally as people. The Mantel Group principle that resonates most with me? I smile because there's a few. They actually all resonate with me, but I would say "in it together" and "love what you do and be awesome at it". In my mind, people are at their most productive and engaged, and enjoy their work the most, when they're doing something that they're passionate about. The Mantel Group principle that resonates most for me is "make good choices". It's the

fundamental one that sits almost underneath the other principles and allows us to use those principles to make our decisions in how we work. Make good choices: I love this one because it's around treating people as individuals and trusting them to make a choice that's right for them. We're close to a thousand people now, and we are still able to be principle-based, where our team genuinely make good decisions and use their principles to guide themselves. Mantel Group principles enable me to be more me at work. We're breaking the mold in ensuring that we create an experience

for people that is unique to them and really redefines what work means for people.

All right, we still have a couple of

hands up over here. Okay, and for the final question: put your hand down if you ever took more than five minutes to set up your machine. Oh, we knocked out everyone. That's good. Wait, is it everyone? Anyone? Ah, yeah, that's everyone, I suppose. Well, all the people here that put their hands down, we can all be good friends. You know, we've experienced the same pain, and for this talk we've got a

solution for you. But before that, one quick shout-out to our sponsors. This amazing talk wouldn't be possible without their support, especially SoftwareOne for being this room's sponsor, so check out their booth, and thank you so much. Without further ado, let's start the talk. My name is Prince, and I'm a developer experience engineer specialized in automated setups at Atlassian

and a current industry mentor of Coders for Causes. And hi, I'm Dylan. I have just finished my honors year doing software engineering at UWA, and I'm the current technical lead of Coders for Causes. So here is our agenda. Today we'll be introducing CFC and our problems, our history with development environments, then an introduction to dev containers, and finally our thoughts on the future of

development environments, with a little fun showcase at the end. Just as a disclaimer, we have listed this talk as sort of an intermediate talk, but don't be scared. We explain everything in a way that does not require a lot of technical knowledge, so don't feel afraid. And with that, let me tell you a bit about Coders for Causes. As a bit of context, Coders for Causes is a not-for-profit, student-run club that builds software for charities and not-for-profit organizations. The way that we do this is

by mentoring university students and leading them to work on software projects directly with these organizations. We run these projects during university breaks: the winter break in July and the summer break from November to February. And this is a bit of what we've done since starting in 2016. We've worked with many, many clients. For example, we've worked with Foodbank (you might have seen them on the little advertisement screens outside) to make an interactive web

game to promote healthy eating amongst kids, and we've also worked with Pets of Older Persons, abbreviated as POOPS, to make a volunteer pet-walking tracker to help the elderly get their pets walked. So I've shown you what we do; how does this all operate? During these project periods we run workshops (that's them up there) and some peer coding. So as you can see, we've got lots of puny children to set up and teach, as a

puny man. You can notice that everybody has their own laptop of varying cost and variety, and it's also important to note that this is their personal laptop, so they've got other things running on it, like games. And most importantly, we also find that people always run into setup issues. It's always the Windows person. So with all these issues in mind, let's ask Prince: how do

we set up all these machines quickly and reliably? Well, hang on a sec right there, dude. Let's take a step back. Let's first try to understand the scope of the problem that we are dealing with. Every project period, we had to set up 50 beginner developers with different machines to be fit for development. Picture this for a moment: take half of the audience of this talk, and imagine you're going to be helping the other half set up their machines. What problems do you think

you'll face? In Coders for Causes we experienced three main difficulties. First is hardware and OS inconsistency: we have Mac, Windows, and Linux, all of which we have to account for because volunteers bring their own laptops. Second is the duration of setup: for the number of tools that we need, it may take a while to set things up. And third, like what Dylan said, these volunteers do other things with their laptops, which

means that one day it works and all of a sudden it doesn't. And you saw just a while ago, in our little game at the beginning, that this is a known problem in the software industry. We've all been there: you do something on one machine and it works, and then you do the same thing on another machine and it doesn't work. Now you've got a hard-to-reproduce problem, causing a lot of frustration for fellow engineers like you and wasting a lot of our time away from the things we need to do. As a developer, I just want to be able

to just work on things without any problems, and as a company or manager, I don't like my developers wasting time on things that are not making money. So with that said, Dylan, can you take us back to the dark old ages of Coders for Causes? I sure can. So we first start in 2016, our Dark Ages, where we were manually setting up local environments, and this is what it looked like. We download the package, we go

through all these different wizards, and hope you click the right buttons, because if not you'd have to reinstall and do it again. Then you have to consider that there are different instructions for each OS, and there are also different versions, which you have to pick correctly, and it takes a damn long time. Just take a look at this example: this video takes seven minutes to explain and set up Postgres. This doesn't even include the time that it takes to pause the video, follow what was done, and then unpause the video to

continue. Cool. So let's add this all up to estimate all of our tooling: this can take up to 80 minutes, and this already assumes that you're familiar with the steps of setting it up. And let's not forget that we have to do this 50 times for 50 developers, which totals about 66 hours. And this is still a conservative estimate, because you have to assume that you haven't messed anything up, because

if you did, then now you have to retrace your steps and figure out what you did wrong, or whether any of what you did conflicted with what you had previously done, and this can sometimes even take longer than the installations themselves. So to summarize, with this method of manually setting things up we have setup inconsistency between OSes and versions, no easy restart from a blank state, it takes a hell of a long time to set up, and there is a lack of isolation

of issues. On the plus side, though, we have no additional compute overhead, which we'll talk about a little bit later. So that was our initial starting point, and overall it was a terrible experience. Help, Prince! So in 2021 Docker was becoming very popular as a way to run applications [inaudible], and we knew we should jump on the hype train. For the benefit of the people who

haven't used Docker before: Docker is a way to package all your dependencies and libraries in containers, and then you run your service in those containers. The way it works looks like this. First, you define a Dockerfile. A Dockerfile is a step-by-step instruction on how to set up a machine. Second, you use this Dockerfile to build a Docker image, and this image can then be downloaded and used to run a container. That's a bit of a mouthful, very

technical, so let's simplify that a bit. Imagine we want to build a Lego man. First, I have a blueprint. The blueprint has the dimensions, exactly how I can make this Lego man, and I can use this blueprint to create a mold. This mold I can give to my friend so they can use it, and I can use it myself as well. I can use that mold, the image, to build the container that we have here. Now that I gave you a

simplified explanation, let's see how we can build our own container in practice. So here's a simple Dockerfile that I have over here. You'll notice at line one I'm specifying that I want to start with the Python dependency, at lines five and six I'm installing Python libraries, and at line 10 I'm running a web server in Python. Now let me just run a couple of commands over here. First, let's check whether things are

running. No, nothing running yet. Then I do docker build. From this, it's going to take that Dockerfile, it's going to start with Python, install the libraries, and then the image has been built. Then we can use that image to run a Docker container, and this is how it works. And that's what building and running a container looks like. But the thing is, you don't always need to be building images. Some

people have published images that we can just use, for example databases such as Postgres, which you see over here. If we want to use that published image, I just need to run this one command, and bam, I have a database, with none of the long manual installations. And that's for running one container. In reality you'll need to run multiple, such as a backend and a database together with a frontend, and for those cases you'll use Docker Compose.
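The exact "one command" shown on the slide is not captured in the transcript, so the following is only a minimal sketch of what starting a published Postgres image might look like, written as a small Python helper that assembles the `docker run` arguments. The container name `cfc-db`, the password, the port, and the image tag are hypothetical placeholders, not values from the talk; the script only prints the command and does not execute it.

```python
import shlex


def postgres_run_command(name, password, host_port=5432, tag="16"):
    """Build the argument list for starting a published Postgres image.

    All argument values are illustrative assumptions, not the speakers' setup.
    """
    return [
        "docker", "run",
        "--detach",                                # run the container in the background
        "--name", name,                            # hypothetical container name
        "--env", f"POSTGRES_PASSWORD={password}",  # password variable read by the official image
        "--publish", f"{host_port}:5432",          # map a host port to Postgres's port 5432
        f"postgres:{tag}",                         # published image pulled from the registry
    ]


command = postgres_run_command("cfc-db", "example-password")
# Print a shell-safe version of the command instead of running it.
print(shlex.join(command))
```

Building the command as a list rather than a single string keeps each flag and value separate, which makes it straightforward to hand the same list to `subprocess.run` later if one actually wants to start the container.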

I won't go into too much detail, but I'll show you what the experience looks like. Let's take it from the perspective of our volunteer developers. On the left-hand side we have a definition of what containers we want to run, and at the bottom I'll be running docker compose up. What this will do is take this configuration, build all my containers, and run them, and

you can see over here it's running now. At the bottom right you'll notice that the containers are running. Top right, that's our frontend; it's running. And let's take a look at our backend to see how it works. So right there we have our backend website. Now let's try, for example, adding a new event right here. Let's put DDD Perth 2024, and for today. All

right, and save. Now that's saved, let's bring up our database and take a look at what's saved in it. Okay, we're connected. Now let's run a query. Okay, and right there, as you can see, there's the record created. And as you can see over here,

it's genuinely a better experience. We've been able to set up a number of tools in just one command and in just a couple of minutes: backend, frontend, database, mail server, you name it. However, it's not perfect, and here are a couple of examples where Docker fails. If I want to run pytest, for example, in my IDE or terminal, it will just say command not found, and that's because the tools that are installed are only accessible inside

the container. To make it work you need to go inside the container, which means our volunteer developers need to run docker ps, get the container ID, and pass that container ID to docker exec. Now they're inside the container, and when they're inside they can access the tools. But I can already see the faces in the crowd: this is not easy, this is not good to remember. Like,

all the complexity right here, that's another thing my developers need to know, and that's not really good. And if that's not complicated enough, there's also the problem of Docker networks, in connecting between the host, your backend, and the database. There's also another annoying thing: because your libraries are installed inside the container, the editor cannot detect these libraries. This is important because without

this, your code editor cannot suggest information about the code that you're trying to use. So now we had to install the libraries both inside and outside of the container. Dylan, I remember there was also another very annoying thing that Docker had. What was it? Yeah, and it's the biggest pain point of Docker: the fact that it hogs CPU and memory. In CFC we found that there are always one or two people that have potato laptops, which

can make the whole Docker setup a little undesirable, and that's them there. So to summarize, with Docker there is significantly less inconsistency in setup. If you ever hit a problem, just restart the container from scratch. Because the setup is done with mostly one command, if you want to set up 50 or hundreds or thousands of machines, it's easily scalable. As for some cons, though,

there is the complexity in developer workflows that we discussed, such as going inside the container to access some tools, and the Docker network issue that we didn't really go into in much detail. There is also the duplicate install that's required on the host to allow certain tools, such as code autocompletion, to work. And of course, Docker's overhead. So as you can see over here, Docker eliminated a number of setup issues that we had; however, containers

added more complexities for us and our volunteers. So a few years passed, and I know this needs to be better. Hey Prince, I know you've been experimenting with improving developer environments at Atlassian. Could you tell us a little bit about that, if it's not confidential? Oh Dylan, I can't tell you that much, only the parts we've published so far. But luckily enough, at that time I was experimenting with the concept of development containers,

started by Microsoft. The reason I was experimenting with this was that we wanted to simplify the experience of setting up for tens of thousands of our developers, and I think some part of this can be useful to give back to Coders for Causes and to the community. Dev containers are a way to package dev tools, including libraries and support packages, but also IDE extensions and more. Then you develop inside the

container. So yeah, it is Docker, but what makes it different is that the container is customized for the IDE to connect to. And before we get to the nitty-gritty details, let's see how it works in action. When our developer first opens the repository, this is their code editor, and they are greeted with a prompt to reopen it in a container. When this happens, as you can see over here, the IDE or the

code editor is now pulling our development container. This is going to take a couple of minutes, two or three, but we're going to be skipping that part. After downloading the image, the dev container is automatically created, and this is what our volunteers see. So it's booted up, we have the Coders for Causes development environment there, and we're running a one-time setup. In this setup we are

warming up the development environment, ready for use, by installing the libraries. You see over here we're installing the backend, and then in the next section, npm, we're installing the libraries for the frontend. After that we're running our database and then performing the setup for the database tables, so the backend is now set up in such a way that we can just start working. In particular, we're adding test data, such as the superuser we

created, and then we're checking that the database is running. Now that the one-time setup is done, let's take a look over here. So I run docker ps, showing me that the database is running. Now let's take a look at the extensions. The extensions come pre-installed, and check out the number of extensions that we package

over here. An example of an extension we like is Error Lens, and you'll notice that later on when we start editing code. Now let's take a look at running our backend. So I'm running my Django application with runserver. Now it's running at port 8000. Let's open that in the browser. And remember the time that I said we added the test data that we can

just use? So over here we're logging in as that superuser, and voilà, we're right here. Now that's the backend; let's take a look at the frontend. We're going to the frontend and then yarn dev. Now it's running at port 3000, and here's our frontend being

rendered, running at localhost 3000. We have a button right here that tests its connectivity to the backend. All right, looking good so far. Now let's try editing a couple of things in the frontend, just to see that the development environment is working. So this is Error Lens, as you can see over here. Because of it, you can immediately see where things are going wrong, or where things could go right if you're

fixing up the warnings. Then let's just edit this and refresh our browser. Hey, right there. Now let me show you one of the most important tools of a developer: a debugger. Some of the audience here will agree with me that setting up debuggers

is tedious, so instead of setting them up, people just clutter the codebase with console logs and print statements. Here in our setup it's preconfigured. You don't have to mess around with the configuration; it just works, and that's what we want as developers. So now I have this backend running again, but this time let's take a look at a particular endpoint, the endpoint that we started with, the pong endpoint. So let's navigate to API

slash, API health check. Over here it's loading API health check, and that's the endpoint that the frontend is communicating with. So now let's take a look at that particular line. This is our code. Let's just put

a variable over here, and we'll modify that during run time: a is equal to one, and let's just print it out so that we're not getting bugged by the linters. Right, let's put a breakpoint right there. Now let's open this in our browser, but this time when we open it the application will stop so that we can investigate how things are working. There you go, it stopped at line 11, and you'll notice that because of this debugger mode we're able to see the context, the values of the variables,

the state, on the left-hand side, and then we can use the debug console to change the state, for example a equal to two over here, and we're changing the values right there. So yeah, that's what the general experience looks like, from opening the code editor up to developing with dev containers. Now let's go through the

features that you've seen in that demo. Firstly is the shell, the terminal. There are a number of cool features right here: you have the relevant information, Git, the current folder, and the current user that you're running as, and you also get autocomplete based on your command history. We've set this up as a default for everyone in CFC. For a bit of context, this is what it looks like if you have not set up the shell: very bare-bones and sometimes hard to navigate. And secondly, there is a

pre-installed recommended set of extensions that is helpful for developer productivity. Yes, like VS Code Pets over here, for emotional support. And as we mentioned earlier, we had a one-time setup, and this is one of my favorite capabilities that dev containers provide us. On create of a dev container, we can set up the standard environment variables, we can then

pre-install the libraries that we're using, and even set up test data. That's just one of the lifecycle hooks; you can get very creative with them. There are more lifecycle hooks, and each of them is executed at a different point in time. These are the functional benefits that we get with dev containers. Hey Dylan, do you know any other benefits that we get from dev containers? I certainly do. There are a number of

educational benefits when it comes to using dev containers. So imagine yourself starting as a new developer: what would you have wished you had been exposed to earlier on? Do you have any in mind, Prince? Yes, for me, I wish I had learned much more about the shell. In my experience, sometimes the shell is what allows you to crawl through millions of lines of documents in just a couple of seconds. What about you, Dylan? For me, well, I'm about the aesthetics, so

[inaudible] VS Code themes, I'm all about that. So I wish I had a pretty shell and all those nice autocomplete extensions. When you learn these tools, it's sort of like a cascading effect: when you use the tool it makes you more productive, and thus you enjoy developing more, and when you enjoy developing you become more motivated to learn and more excited to build, and that's kind of what we want our volunteers to feel. This also has a number of

practical benefits. When we run the workshops, we use a single dev container for everything that we do, and it makes it easier for us to help volunteers if everyone is on a consistent, pre-configured environment. It's also easier for the volunteers themselves to transfer their knowledge from the workshops to the projects. So we talked about the benefits; let's take a little bit of a deep dive into how CFC has their dev container set up. At a high level, we

have two phases, with phase one being building the dev container and phase two consuming the dev container. Phase one is where we build our Docker image and publish it to a container registry. This is done in one single repo that we call automated setups, a very inventive name. Then in phase two we specify the dev container image that we want to use in the consumer repo. Then

the laptop will pull the image down from the published container registry, and then we will create the dev container to use. So this is the Dockerfile that gets turned into an image. This is a simplified version of what we have. You can see we first specify a base image, then install the dependencies, and copy any additional shell configuration over, and other stuff, in our automated setups

repo. Then we can specify our Dockerfile in the devcontainer.json config. It's a bit small, but this is where we can install our features and extensions, like Node, Python, and all our other VS Code extensions. There we go, skipped a bit. Okay, that's not it. Cool. Then we get to publish it to the container registry. This is just a CI/CD

pipeline on GitHub. After publishing our dev container image, we can then specify it in the consumer repo's own dev container config, and it looks a bit like this. We can specify the image up there, and we can pick extra features, notably so we can be repo-specific. This is also where our lifecycle hooks are most useful, to automate these repo-specific setups, and as you might have seen earlier, we can use one of our lifecycle

hooks to run the one-time setup that you saw in the demo. So back to the pros and cons; let's summarize. We get all the benefits of Docker whilst abstracting away Docker's complexity. We also get the auto-configuration of the shell and extensions, as well as the educational and practical benefits. With dev containers, it's also notable that if you change the version that we specify in the repository, it will also update everyone's environment, and this is

useful for things like security. This also solves the need to install duplicate tools on the host. Now for the cons: if you're using Docker for production, you'll still need to use another Dockerfile, and finally we still have the compute overhead of Docker. In terms of overall impact, though, back to that problem statement: we're saving over 80 hours per project period by eliminating the works-on-my-machine frustration, and we're also

increasing developer productivity with these new fancy tools. So we found that these dev containers are pretty cool, pretty amazing, and the volunteers were happy. As someone working as an industry leader, what's next on the plate, Prince? That's a good question, Dylan. Well, I'm glad you asked, because let's take a step back first before we dig deeper into that one. The problem that we're facing with Docker is that there are a lot of hardware limitations, and Docker just

consumes a lot of CPU and memory. In the industry we hit hardware limitations too. Docker and other tools use that computing resource, and sometimes with big software it's not even possible to run it on one computer. For you in CFC, as a not-for-profit fueled by student developers, CFC just lacks the resources; you can't simply fund other people's laptops, so you always hit that problem. But the way I see it, we are moving into a

future where there will be such a thing as remote environments. Imagine: what if we move our compute into the cloud? An example of that is GitHub Codespaces. That will mean that we're no longer limited by the hardware that we have. We can use the power of the cloud to scale and pick the particular machine specification that we need. Let's say you need a machine that has 32 gigs of RAM:

it's as easy as a couple of clicks. Hooray, finally we can download more RAM. And this tech is actually built on top of dev containers. If you look at the other alternatives to Codespaces, they all follow the container approach. To summarize, with remote environments we get the benefits of dev containers, we're no longer limited by the compute resource, and because it's in

the cloud, all the workloads that are network-heavy, such as downloading and installing packages, will become way faster. But also, because it's in the cloud, there's latency between the physical computer that you're using and the computer that's actually doing all the compute, builds, and compilations. The other thing to consider when you're moving to remote environments is cost. For CFC

it's mostly free: with GitHub Education you get free hours that you can just use. But for enterprises it's something that needs to be considered. And that's why in this next demo we're going to be doing something unconventional: I'll be proving to you that it's possible to run an entire development setup from this phone. All right, let's set things up. Give us a few moments.

It's to be noted that not every phone can plug into a display. Unfortunately mine cannot, but for the purposes of this demo we can do it. Ready? Yes. So we're right here in GitHub, in the repository that we're going to be running. So now let's open

our GitHub Codespaces. I've already booted it up; it actually doesn't take a long time to boot up. So now let's connect to it. There we go: we're setting up the codespace, booting up, setting up the remote connection. Yep, we're a bit constrained on space, but thank you Microsoft for making VS Code mobile-friendly. Yeah, real estate is a bit of a problem right now. But if you have a phone that can connect to an external display, this can work pretty

well. All right, I think we're connected. Let me just open a terminal right here. Okay, right there, it's showing up. Now let's run docker ps to see what's running. So we have a database running here right now, and let's check out our frontend code. Oh, wrong button, different key press. Always something has to go wrong in the live demo.

yeah screen all right now I want to open a front end file over here right there let's just try testing out some things we wanted to test that our extensions are working so give it a they are working latency there we oh there we go so that was the extension that I was talking about so similar to the previous demo that you've seen so it is working in terms of

extensions right there now let's take a look at running our front end first uh there we go this is a nice autocomplete right here okay we're running at port 3000 right there now open it in the browser see how it's forwarded the port for us yep it's forwarding the port so it's a secure connection from the development environment to our computer now right there we go we have hello world right here let's change that

text right there hello DDD there you go save that now let's go back to that browser and there you go so front end is running yeah now the harder part the back end so let's cd into server and then I think okay there we go I didn't remember

the command but yep nice Auto defaults oh it's running at port 8000 right here and now let's connect all right I remember there's this uh page called the dash admin this is the login page right there and then there's yeah let me just show that really quick yeah there you go and then there's another endpoint I believe there's the API health check thing right there and it's

running so yeah that's our live demo let me just uh get this loading up to the presentation yeah good so as you've seen in the demo this is the bright future of automated setup that will allow us to save a lot of time on boring manual setups so

we can focus on the things that will generate value but also a future where anyone even the less fortunate who cannot afford computers that are really beefy for these tasks has the means to just get a remote computer with the free hours of GitHub Education and start learning how to code and that's it folks thanks for coming and hope you learned

something thank you thank you Dylan and uh th for a great talk uh let's uh have some questions from the floor just raise your hand if you've got a question hello do you think there is value in students going through that pain uh of setting up their developer environment that's a very good question I do think to a certain degree you know they have to feel that pain to

understand why all of these investments are important and that's why in university you know like I really like the fact that university teaches you I guess to just manually install certain things but when you go to the industry or at least with Coders for Causes alongside university you get a different view as to how things work in the real world you know you can't just spend hours of time oh getting this uh wizard click click that to get it working you just want to get to the

meaty part of like generating value like code for the product yeah from a CFC standpoint there's sort of like a balance um we have a limited amount of time during these projects to develop things um so we need to minimize this setup time in order for them to learn the most um yeah thanks for the question thanks any other

question hey friends how you going yeah good man yeah so I think one of the biggest uh downfalls is the latency do you know if there's any tricks to kind of fix that well I think there's a couple of things to keep in mind uh the first one is uh when you're developing locally and connecting to the remote environment some parts are actually done in

batches in terms of syncing so when you're typing it's not like the characters need to go over there first you know like what you're typing exists locally first then syncs remotely so it's kind of like a much more optimized network um transfer for that but you do make a good point it's something to keep in mind especially if you're working let's say on a mine site this solution probably wouldn't work because the network may not be that

reliable cool thanks this will look good on your Apex thank you man um hi uh my question is um are Dev containers limited or do they only work with Docker or can they also work with other alternative container runtimes like Podman cun that sort of stuff and I guess a follow-up question is in the demos we're using VS Code how's the

support or integration with other IDEs like stuff from JetBrains for example yep that's it's a very good question so I'll answer the second one first so for the second one Dev containers is a specification it's not just specific to VS Code like there's particular documentation over here saying how to create an IDE that will integrate with it and because of the fact that it's a specification so VS Code uh VS Code integrated with that specification uh IntelliJ JetBrains actually also

integrates with it I believe there's also a Dev container CLI wherein like for people that let's say want to use Neovim or Vim like much more in that terminal space they could have it installed right there and they just boot up Dev containers from the command line instead of from an IDE and then the first question um the Docker runtime stuff I actually have not tried it so I'm not sure I'm not sure what you like mean exactly by

different Docker runtimes like personally I'm using OrbStack on my laptop it works fine anything that can interface with Docker and use the same Docker commands build run whatever um should work fine at the end of the day it's just a container as long as you can run a container it should be good yeah I guess it's more of the container engine itself so not using Docker but using Podman or cun or that other sort of thing yeah I'm not I

haven't to be honest I haven't looked at um I haven't used Podman before so I can't uh I can't say how much have you used it before I hav SP okay all right one more question unless if there's a l call yeah I have the microphone um so I understand this is mostly for an educational like teaching people to code very quickly environment but um do you have any processes in place for say you're happy with what you've developed

inside the container uh stripping out the Dev container and all of um the packages installed on there and like the IDE and stuff and then just running the container with the uh service that you've developed in there so like uh like a bare non-customized version of a Dev container oh more like how how quickly do you do or do you have any processes for quickly um taking like what you've made inside of a Dev container and then

just running that raw without all the development tools okay so for like in production environments yes yeah yeah do you about that so the only the only thing is like what we have to focus on is the development container the development container is specifically used for development this is the reason why on one of the cons slides that we have it says that if you want to run a particular container in production then you'll need to have a separate Docker container so you have a Docker container that is for

your development environment and actually in that setup if you notice uh in the Dev container we're running another Docker container inside it it's the database so you could actually follow that same pattern as well like if you just want to test your production Docker container you could run your production Docker container inside the Dev container hi um I got a question around

building the Docker container from an existing project versus a brand new project from scratch because with a new oh sorry with an existing project you might encounter stuff that you have to untangle that was previously configured for various reasons would you have any guidelines or advice for someone to start out on building their own Dev container

to like have a target to reach yep um that's a that's a good question so I think there's different layers to this one is currently running in a Docker Compose kind of style and the next one is like being able to run it in the Dev container but I think the easiest way to uh look at this is like take a look at it from your laptop what tools do you actually need to be able to start development on this one so now focus developing your Dev container on

those tools now you run Docker Compose inside that and from there you have to decide whether it's worth removing that Docker Compose and putting in all the uh the tools that you're installing or at least going without the containers um if that's worth it the other case is like for example what you've seen right here is that we're running python manage.py runserver directly in the Dev container and we can do that

because that's an isolated environment but there's also another option you could run a Docker container that has Django inside it as well but like the benefit you saw over here is like we could just directly inject our stuff in the Python runtime to do a Python debugger for example and that's not exactly easy when you have like for example Docker Compose I just wanted to add a little commentary to the conversation going on

a little bit earlier um my two cents is that um so the context the question was productionizing this Dev container um image uh my advice would be not to productionize your Dev container image but a Docker image is actually usually a layering of many images kind of stacked on top of each other where you add this bit and that bit and that bit so you just take a point in your stack where you go this is the minimum of what I need to

run this in production and you say this is my production container and then you just add your Dev container dependencies layer on top of that and now you've got your Dev container and you've got your uh production container and they're very very similar they're as close to similar as you could hope to get them which is what you want yep and there you go yes that's a that's a very good point all right thank you for a great talk let's give a big applause for the two

gentlemen and also uh this is quite a great opportunity to help Coders for Causes isn't it yes good causes yeah good causes they're doing great stuff there so feel free to join on board and help them financially or even as a volunteer thank you [Applause]


hello how are you

feeling tired oh you've come to the right place now because this guy will pretty much put some energy into his

talk so don't worry I've got you covered anyway thank you for coming again to this place if you don't know who I am my name is Michelle niku I'm a research fellow at the University of Western Australia I don't know if I'm a developer or a data scientist I don't know maybe a d Enthusiast so that's why I'm standing in front of you as the MC well uh we have Troy who is going to be

the last one from this venue he's a software engineer Troy has 20 plus years software engineering experience across a range of industries including games education healthcare and resources he also maintains and contributes to several open-source projects and has mentored at CoderDojo Troy is interested in what makes software great not just for the user but also for the team maintaining

it plus I will add for the environment as well please let's give a round of applause for [Applause] Troy on my clicker just want to start by thanking our sponsors um it's great that it's so

cheap to come along um these free tickets well almost free $60 tickets but it's quite expensive to put the thing on so that's where we really need to say a big thanks to our sponsors um and this room has its own sponsor SoftwareOne so thank you SoftwareOne Tony Hoare invented the null reference to make programming better and 40 years later he said it was a billion dollar mistake we'll look at what he meant in

this talk but mainly I want to ask another question has the software industry made other billion dollar mistakes are we making new ones right now this talk is an exploration of those questions and I'm going to propose some additional members to the billion dollar mistake club but to start with I want to get us in the right frame of mind for thinking about our mistakes as an industry I thought I'd do that by recalling some headlines from within living memory in 1996 the Ariane 5 rocket 501

exploded destroying itself and its payload of painstakingly developed research satellites totaling half a billion in losses the cause was a simple numeric overflow in the guidance code for a while in 2015 anyone could crash an iPhone just by sending it a message due to a bug in the character handling code in 1985 patients were injured in some cases fatally after being hit with huge doses of radiation from a machine called the Therac-25 the machine's software

had a race condition that wasn't detected in tests but it occurred in real world use the crypto industry has seen not only huge frauds like FTX but also huge investor losses caused by software bugs in code that runs on blockchains such as Ethereum just this year in 2024 a bad CrowdStrike update caused numerous flights to be grounded and brought banks and retailers to a standstill with total losses estimated at over $5 billion because

of a line of code that read past the end of an array the organization CISQ released a report in 2022 estimating that overall software problems cost over $2 trillion US in losses per year and that's just in the US alone so it seems the message so far is our software is bad and we should feel bad but here's something that will hopefully cheer you up we're not the only ones making billion dollar mistakes

when you're feeling bad about failures in your life what's more comforting than looking at the even bigger failures of your friends and neighbors so let's just go there briefly just for fun and hopefully that'll get us in the mood to spend the rest of this talk casting a critical eye at our own mistakes there's lots of good examples in the finance industry I'll just mention one in 2022 a Citigroup trader put a big number in the wrong field and pressed enter European markets went into a selling frenzy the total losses which included a massive fine wiped out two

years of the trading desk's revenue but at least no one was hurt right which brings us to this guy Thomas Midgley invented leaded fuel to make engines run smoother which then was pumped into the air by cars for decades lead is a potent neurotoxin one study found it knocked six IQ points off people born in the 1960s and 70s Midgley then went on to develop chlorofluorocarbons commonly called CFCs for use as refrigerants which caused the hole in the ozone layer that's one

reason why two-thirds of Australians will be diagnosed with skin cancer in our lifetime one historian said that Thomas Midgley Jr has had more adverse impact on the atmosphere than any other single organism in Earth's history so in the grand scheme of things maybe a billion dollar mistake here or there isn't so unexpected but let's get back to our original question are there other billion dollar mistakes to answer this let's do some napkin maths the infographic that I'm showing you here

it's from that 2022 uh report that I mentioned it's called the CPSQ report which stands for the cost of poor software quality and you can see it's a bit messy um but it boils down to this we still don't know how to create software without also creating lots of extra costs in the forms of legacy tech debt hacks defects in theory at least these costs are avoidable in the sense that if we figure out how to create good quality software we wouldn't have to pay these

costs I don't want to be unrealistic I don't think we'll ever have perfect software and some of these costs are really human costs like social engineering or just plain bad management which isn't unique to the software industry but let's just say that both this report the CPSQ and Tony Hoare's billion dollar mistake let's just say that they got their numbers right to within an order of magnitude then a simplified view would look something like this seeing it this way it's hard to

avoid the conclusion that there must be other billion dollar mistakes out there and probably quite a lot of them but before we start looking for candidates there's just one more thing I want to clarify those bad headlines in my introduction they looked at mistakes at the point where things went obviously wrong but in this talk I'm really only interested in the driving factors that cause these instances of mistakes as an analogy imagine a product recall for a car where there had been a lot of incidents where a wheel had fallen off the car maker points out that in all

these incidents the driver had hit a pothole or the curb so the problem is the driver not the car but the regulator states that a car should be built to deal with known hazards like that and it holds the car maker liable for the cost of all the accidents and the redesign of the car so the poor car design was the driving factor pardon the pun behind all these costly accidents in software we can also find driving factors behind all our costly accidents and the one I'm

particularly interested in which is unique to our industry is programming language design Tony Hoare the inventor of null references pointed to this in the same talk where he said null was a mistake he said a programming language designer should be responsible for the mistakes made by programmers using the language so programming language design will be a lot of my focus in this talk another apt phrase for this would be prevention is better than cure so here's an overview of the

mistakes I'm going to explore today if this was a club the billion dollar mistake club the first three would be retired members things that we either fixed or we just avoided altogether because we caught them early then I've got a couple of what I'm calling rejected members of the club um we'll get to that and then we've got some current members of the club and then we'll look at potentially new applicants so let's start with null references the namesake of this talk and here's Sir Tony Hoare in 2009 giving

that talk and looking quite ashamed of himself um I think he's actually just looking at his speaker notes but that's I thought a good screen grab for this talk so what did he do exactly and why so let's first of all start with looking at the case for null references let's just say that we've got the need for some sort of data structure with nodes which point to another node and we need to be able to create cycles so in my simple example at the bottom here we want to have node A pointing to B B to C C back to A um but in order to construct a node

we need to give it a reference to the next node so when we try to construct A we don't have a reference to B yet so we can't construct that but we could maybe construct B first so that we can construct A but then we need C first and C needs A so one way or the other there's no way to construct this with valid references that's something that Tony Hoare was trying to solve and he realized well what if we make it that null can be a valid value wherever a reference could appear what we could do then is we could assign null to a reference to mean

this is missing or it's not initialized yet and that's what we've done here in this example circled now the problem goes away because we can just construct our nodes A B and C and then we can actually assign the references from each to the next and the problem's solved great so what's the billion dollar mistake well to illustrate that here's some code from a fictional country's national defense system this was a huge multi-team effort here's some of Team A's code it's pretty straightforward uh if the country's under attack we obtain

a target and we launch nukes at it unfortunately the country was attacked and rather than defending the country this program crashed with a null reference error and half the country was destroyed an investigation was conducted presumably by the other half uh to find out what went wrong Team B shared their radar tracking routines and Team C shared the launch routines and the source of the error was now pretty obvious obtain target the function there since version two could now return a null target because now it handles

multiple targets so you know if you're going to ask for a single target there needs to be some sort of sentinel value to tell you oh no you need to go and call the other thing so they changed that to return null to support this new version but the launch code of Team C assumed that the target was always defined that assumption worked fine in version one but it now crashed the system so Team B they refused to take the blame because technically returning null was always allowed it's always a valid value for references so it wasn't a breaking change and they clearly documented it they said Team C should

have been checking for null all along but Team C refused to take the blame they stood by the obvious logic that it's pointless to launch at a null target and furthermore all their unit tests with their mock targets were all passing at the time of the incident the conclusion of the final report was that the programming language itself was to blame because there was no way to capture critical requirements like the possibility of null in the program code itself in a way that could be automatically checked this critical information was conveyed only in comments and getting it right was a manual job left up to human attention
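
As a minimal sketch of the scenario just described, here is a Python version. The names obtain_target, launch_unchecked and launch_checked are hypothetical and are not taken from the talk's slides, and Python's None stands in for null. The Optional annotation is one way of putting the possibility of a null into the code itself rather than into comments, so that a static checker such as mypy can flag a missing check.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Target:
    name: str


def obtain_target(threats: List[str]) -> Optional[Target]:
    """Hypothetical 'version 2': a single target, or None as a sentinel."""
    if len(threats) == 1:
        return Target(threats[0])
    # None here means "call the multi-target routine instead".
    return None


def launch_unchecked(target) -> str:
    # Team C's assumption: the target is always defined.
    return f"launching at {target.name}"


def launch_checked(target: Optional[Target]) -> str:
    # The signature admits None, so the None case is handled explicitly.
    if target is None:
        return "no single target, nothing launched"
    return f"launching at {target.name}"


def demo() -> List[str]:
    results = []
    # Version 1 behaviour: exactly one threat, so the assumption holds.
    results.append(launch_unchecked(obtain_target(["bomber"])))
    # Version 2 behaviour: two threats, obtain_target returns None.
    try:
        launch_unchecked(obtain_target(["bomber", "missile"]))
    except AttributeError:
        results.append("crashed on a null target")
    # The checked version deals with the same input without crashing.
    results.append(launch_checked(obtain_target(["bomber", "missile"])))
    return results
```

Calling demo() runs the three cases in order: the launch that works, the launch that crashes on the null target, and the checked launch that handles it.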

to sum up null references basically there's a problem a solution consequences and alternatives and I'm going to have a slide like this for all of these so in this case he was just trying to improve the safety and ergonomics of references null solved a problem and it was easy to implement but the consequences which became apparent decades later were that the ability for null to be a valid value everywhere just massively increased the space of possible programs that were acceptable to the compiler um and it

made bad outcomes easier to create and there are modern alternatives so we don't really have to deal with this and this is really a talk about problems not solutions but you know just so you know there's things like non-nullable references option types union types and static control flow analysis so that's null references I'm going to move on to another retired member of the billion dollar mistake club now and that's the goto statement goto would certainly have

become a billion dollar mistake if it hadn't been apprehended early most of us probably don't even see goto statements anymore but we've probably heard of this very famous paper by Edsger Dijkstra called Go To Statement Considered Harmful which is the progenitor of all the subsequent considered harmful papers this is the entire paper on this slide it's not meant to be readable I just wanted to show you how short it was considering the enormous changes it brought about in how we program today his essential argument which I've highlighted here was about human

fallibility and how better programming language Concepts could prevent whole classes of common mistakes he says here in these highlighted bits for a number of years I've been familiar with the observation that the quality of programmers is a decreasing function of the density of goto statements in the programs they produce our intellectual powers are rather geared to master static relations and our powers to visualize processes evolving in time are relatively poorly developed for that reason we should do our utmost to shorten the conceptual gap

between the static program and the dynamic process to make the correspondence between the program spread out in text space and the process spread out in time as trivial as possible so here's an example and I got ChatGPT to generate uh basically two examples of the bubble sort algorithm they're both in a dialect of BASIC so the one on the left is using line numbers on every line and goto statements and the goto statement is pretty easy to implement right every line is numbered if you come

across a goto statement that says you know go to line 70 then it just jumps to line 70 and continues from there but I've also annotated that with some arrows showing you where the control flow goes and it can get quite confusing if you basically want to understand what the loops are like what are the loops in the bubble sort what are the conditions where are the guards where's this you've really got to run this in your head it's quite um easy to get wrong and in fact ChatGPT got it wrong there's a bug in that code and I thought you know what I'm going to leave that bug in there um because it shows

how easy it is to just get these things wrong and well done if you noticed on line 60 that there's actually an if statement there um that effectively goes to line 70 in both the true and the false condition what's interesting is that you could not even express that in structured programming which is the version on the right which is what um Dijkstra was proposing that we switch to you couldn't create an if block that was both the you know the true and the false block so um goto works it's very very

powerful in fact uh but you know structured programming on the right is so ubiquitous now that it's pretty much all we know we don't even call it structured programming we just call it programming the summary for goto is that programs need to be able to express control flow and goto solved all this in one statement and it was easy to implement but consequences this is a familiar one its flexibility allowed a huge space of invalid programs that

couldn't easily be detected because they're semantically valid they're valid programs they're just not the program you wanted they're one with some bugs in them um there's alternatives now that's why we don't generally even hear of goto anymore and the main one is structured programming you just have these blocks which represent loops and conditionals um so we've lost power we've lost power to express things using goto but actually I don't think most of us miss that power all right this one's going to be

an interesting one a bit obscure this one like goto it's now mostly forgotten dynamic scope but this was the way variable lookup worked in early versions of Lisp and in a few other places and I think if this had taken off it would have become another member of the billion dollar club of mistakes so again I've got some code on the left and the right and on the left this is just plain JavaScript using static scope which whether you realize it or not is basically how we always scope everything in pretty much all languages now um what you'll notice is that there's three

different declarations of a variable called a one is at the top-level global scope and there's one declared in the function g and there's one declared in the main function they have different values one two and three but whenever you call the function so at the bottom there when we call f what does it print it prints one when you call g what does it print it prints one so most of you should know why that is if you look at the function f you're trying to work out what does that a refer to in the function f and the rule is quite simple you just look at the text of the code you don't need to run this in your head you just look at the text is there a local declaration

of a in that function f no there's not so you go to the outer declaration the lexical scope outside which is the global it's the a equals one so that's what you print and it doesn't matter if you call it from g g calls f but you just do the same logic is there a local a in f no there isn't is there one in the global scope yes there is so we print one now on the right this is a demo of dynamic scope and I've just used JavaScript syntax but the concept applies here at the bottom if you called

f from main it would print three and if you called g which calls f you now get two printed so what's happening here well we start the same if you're looking at the print statement in function f we start the same way we say is there a local variable a no there isn't so is there a local a in my caller well where were we called from we were called from main so yeah there is and the value is three so it prints

three but if you call f via g when f says is there an a in my caller now the caller is a different function it's g and the value of a is two so it prints two um this is really really confusing because basically the behavior of a function depends on the call path that the program took to get to it it'll behave differently depending on how it was called so the question is why would you ever even want this why was this ever a good idea well it was probably you know we didn't

know how to do these things at first so we experimented with different things but also here's a case in JavaScript let's just say that we had some application that was signing up users you pass in a registration form and it like just check that they're the right age and if so we'll we'll store their credentials and we'll send them an email but the developer is like there's a lot of repetition of this variable registration form it's too long like can we shorten this it's like ah remember there's this this thing in JavaScript called the with keyword so you can still do this in all browsers today it's just

in non-strict mode which is quite hard to access um but yeah you can basically pass in your object to the with statement and inside that block any property that is inside that object is brought into the local scope inside that with statement so now we can just say age we just refer to age um we can refer to the username the password and the email and they all come from that object and that is dynamic scope it's the same thing um but it seems quite useful that could be a good case for it why did we get rid of the with

statement well because it's extremely um hard to reason about if a hacker was able to get hold of this registration form they could put a store creds property on that and now you're not calling the store creds that you think you're calling you're calling their store creds because when we work out what does store creds refer to in dynamic scope here it goes ah there's one on your object so that's the one that I call the behavior of that function totally depends on dynamic factors that you can't predict from just

looking at the code that's really dangerous and that's why it has basically been banished so in summary again there's a problem right code needs access to nonlocal variables there were these couple of solutions that people came up with they both seemed kind of good at the time um but it quickly became noted that dynamic scope is complicated and confusing because functions behave differently depending on who calls them um and generally the alternative is we just got rid of dynamic scope and we only use lexical scope with a couple of notable

exceptions like um C and C++ macros all right let's take a break from looking at code I want to look now at a couple of historical computing ideas that were not pursued all we can ask here is what if we'll never know what mistakes we would have made on these paths but I think they're just fascinating historical curiosities if nothing else you might pick up a bit of obscure trivia you can wow people with at the

afterparty the first one is the Russian computer called Setun from 1958 pictured here something we take for granted today is that computers represent everything using binary numbers that wasn't always so this computer did not have bits it had trits instead of on and off or one and zero each trit had three possible states zero one and minus one Setun represented everything in balanced ternary notation multiples of three came

up elsewhere as well the computer had 81 words of memory and each word held 18 trits with hindsight we might say that this was a costly waste since pretty much everything is binary today and alternative systems lost so just a waste of time and money but that was not obvious at the time I discovered Setun while reading Donald Knuth's section on number systems in The Art of Computer Programming he calls balanced ternary notation perhaps the prettiest number system of all and shows how many

numerical operations are simpler with trits than they are with bits Knuth even suggests that ternary computing could make a comeback someday because of its nice mathematical properties when in his words the flip-flop will be replaced by the flip-flap-flop if that happens then our descendants might look at the few decades when binary computers ruled the Earth as the wasteful aberration but I think don't hold your breath on that the other historical marvel I want

to mention here is the Analytical Engine designed by Charles Babbage in the 1830s at the time numerical calculations in engineering banking and other fields relied on tables of numbers that often had errors in them because they were made by humans that theme again Babbage had an idea to make these calculations 100% reliable by automating the whole process he wanted to build a calculating machine but the kind of calculations varied so he wanted to build a general purpose calculating machine that could be programmed for any

task does that sound familiar he did design this machine and it was a completely mechanical design it was never built in his lifetime but um a partial replica was built and part of that is shown here and it was built in the 1990s and it weighed about 5 tons I don't think we truly appreciate how astounding Babbage's inventiveness was here are some of the concepts all familiar to us today that he figured out

100 years before the first electronic computers his design was Turing complete in an age when we didn't even know what that meant there might even be a billion dollar mistake here and that is that his ideas were basically ignored and forgotten and they had to be reinvented by other people 100 years later what if his ideas had gotten more attention and traction back in the 19th century who knows maybe living standards might have shot up a lot earlier maybe some of these retro futuristic pictures would have been realized we might even have

flying cars by now all right let's come back to modern times and now we've moved on to current members of the billion dollar mistake club the first one I want to talk about is concurrency many years ago as a student I had a go at writing a GUI application to draw fractals um like the one at the bottom in the center here the results were not what I expected the renderer actually worked great but while it was

rendering everything else on the computer ground to a halt the music stopped playing the mouse pointer even stopped moving it took me a while to work out what I was doing wrong to cut a long story short I was programming this on Windows 3.1 which was single-threaded and used cooperative threading when you write an application for Windows 3.1 your code is expected to explicitly yield control back to the operating system after very short intervals of computation otherwise nothing else gets a chance to run one badly written program like my fractal renderer would

grind the entire system to a halt modern operating systems use a different model called preemptive threading each application can have one or more threads to run its code and the operating system schedules these threads onto the available CPU cores this means truly parallel computing on multicore machines on a typical running computer there might be hundreds of threads but only several CPU cores so the operating system gives each one a short slice of execution time and then puts it to sleep

and resumes it later when the other threads have had their slice of execution time this all happens very fast and it gives the appearance of everything running smoothly all at once even in the presence of CPU hogging applications like my fractal renderer this model of preemptive threading obviously had to be supported in at least some programming languages especially systems languages like C and C++ but it also became the default concurrency model for higher level languages like Java and C# when they were

released some years later a cool new thing arrived called Node.js it proudly advertised the efficiency benefits of its single-threaded concurrency model I was interested in trying it out but the more I looked at it the more I realized it was basically the same cooperative threading approach used in Windows 3.1 concurrency with a single thread and if something hogs that thread everything just stops but what was interesting were the performance benchmarks they often showed Node's single thread outperforming

enterprise scale multi-threaded solutions like Spring on the JVM and ASP.NET and here's an example of a benchmark from the 2013 TechEmpower benchmarks it turns out that using preemptive threading in an application can be pretty inefficient operating system threads are heavy resources and switching between them has CPU costs they also spend a lot of their time blocked doing nothing today Windows 3.1 style cooperative concurrency seems to be in

all the cool languages and frameworks it might go under various names like coroutines green threads asynchronous APIs and non-blocking IO but they're typically all based on cooperative rather than preemptive concurrency Spring itself now proudly advertises its asynchronous non-blocking architecture what was a mistake for operating system concurrency now looks like a great idea for application concurrency and the inverse is true as well what was great for the operating

system doesn't look so good inside an application but that's not even the biggest problem with preemptive threading in applications the much costlier problem has been the killer concurrency combo on their own they might be okay but when you put these together well the threading model that C and C++ Java and C# and others have adopted has these two parts multiple threads in the same application can execute simultaneously and all those threads

have access to the same application state to see why this is an issue let's look at an example here's two stack implementations the one on the left runs on the JVM and it has bugs that could cause a major issue in production the one on the right runs in Node and it's fine there's nothing different in the code aside from minor syntactic differences that's kind of the problem suppose a thread running the Java code on the left gets suspended at

one of these orange arrows that I've highlighted it's basically between two critical parts of doing a push or a pop at that point the stack is actually in an inconsistent state despite being fully encapsulated in a class so for example we're in the middle of this push method what we've done so far is that we've put our um our item into the array but we haven't changed the size of the array so what if a second thread let's just say that that first thread is put to sleep now and the second thread wakes up and it

also tries to push something onto this stack well it's going to become invalid because it's going to push its own thing and increment then the other one's going to wake up and it's going to increment and there's going to be one lost update and one random piece of data that shouldn't be there um so depending on what the application is doing the results could vary from a minor bug to a major corruption of data this is called a race condition on the right the one that runs in Node there's no possibility for the stack to be observed in an inconsistent state because this code

executes cooperatively running until it yields control since it doesn't yield control in the middle of the push or the pop operation nothing can ever observe an inconsistent state going back to the one on the left there's fixes for this to fix that issue in the preemptively threaded environment the programmer would need to consider where race conditions can occur in their code and this is already a big ask it requires a lot of careful analysis race conditions are often only detected when they show up as a bug even then these bugs are notoriously hard to

reproduce because they can depend on very minute differences in timing once the critical section of code has been identified the programmer can add explicit locks like I've done here on the right um and that's actually going to cause threads to block when another thread is already accessing that section of code so that means that we won't um have anything able to observe inconsistent state or here's another alternative you could just use the magic synchronized keyword why doesn't Java just add this onto all your methods

automatically well because the whole point of concurrent programs is to run fast and have high throughput locks have a performance cost and synchronization explicitly eliminates concurrency so they need to be placed strategically and the burden of figuring all this out is placed on the programmer it's no wonder that race conditions are another common source of costly bugs and given that the cooperative alternative is both fast and eliminates many sources of race conditions it looks in hindsight like a

billion dollar mistake so again there's a problem we've got all these cores we need to use them efficiently the solution well the operating system does it that way let's do it the same we'll just have preemptive threading and all those threads can access the same state but then the consequences um you put those two things together and you get race conditions um and what's the alternative well there's examples of it right cooperative threading instead of preemptive and maybe don't use mutable

shared state if you can get away with it um and also if you're stuck in an environment where you do have this kind of problem you've got to use locks you've just got to do the work okay last one from this section manual memory management if everyone in this room had their own list of billion dollar programming mistakes I suspect memory management would crop up on a lot of those lists and for good reason right it's one we keep hearing about when a vulnerability is discovered and they have really cool names too like Heartbleed WannaCry Cloudbleed

let's cut right to the heart of the problem with this one it's basically a C and C++ thing so C's approach to memory management can be summed up in two parts number one vest all power and trust in the programmer and number two keep the compiler really simple so I've got a basic example here first thing that we're doing in this example is we're declaring a kind of data type that we're going to use in our things just a you know dummy example so here our data type just has a string called a name but a string here is represented as a pointer to memory and we're just

telling C that pointer to memory represents a list of characters and we hope that that's what there is in there by the way how would you know how long the string was that was in there it's not referenced there right you literally have to go in and look at all the bytes like follow that pointer look at all the bytes and look for a null byte at the end or a zero byte at the end hopefully it's there otherwise you'll read right past the end of your string and into some other data structure and who knows maybe you even can get a segfault okay we've got our structure the

first thing that we're going to do in our little program here is we're going to create an instance of this structure and for that we have to allocate our own memory for our string so we're calling malloc to do that to get 16 bytes of memory and then we are um telling C this raw pointer that we've just allocated I want to treat it like it's an array of characters and now we can just copy some data into there um few things to be careful of here I didn't do any error handling here malloc can return zero I haven't checked for that here so we would just dereference that that'd be another sort of like a

null reference error but here you're not going to get an exception and you're possibly just going to get a segfault or who knows um also be very very careful when you copy your string into here that your string isn't longer than the amount of memory you allocated because you'll just write straight off the end of the array maybe another segfault maybe you'll just corrupt some other data structure in your memory okay if you get past that what we're doing now is we're going to make a copy of our structure so we're copying from our source structure to our dest structure and we're going to fully exploit the power of pointers here by just having a helper

function to copy structures that all it does you just pass it these like raw pointers and a length and it'll just copy whatever it finds at those pointers from one to the other so we're just copying our structure but note that that basically means it's a shallow copy and in particular we haven't copied the full string from one to the other we've just copied the pointer to the string so we're done with our first structure now we allocated that memory ourselves manually so we have to remember to deallocate it manually if you forget this step you've got a memory

leak and finally we've still got our dest structure so let's print out the name that we had there and now your thing explodes because that pointer to that name is the one that we just freed on the previous step we'd aliased the pointer um so yes there's basically a minefield of things that you've got to get right and um what are the consequences of getting it wrong well it varies from mild to catastrophic as a lot of incidents show and where is the burden

of getting all this right well it's placed firmly on the shoulders of the human programmer this hasn't worked out well if you add up all the costs of all the CVEs in terms of losses to hackers cost of remediation as well as the cost of all the tooling and developer time spent on this it's probably more of a trillion dollar than a billion dollar problem even Bjarne Stroustrup here the inventor of C++ might agree he said C makes it easy to shoot yourself in the foot C++ makes it harder but when you do it blows your whole leg

off oops um perhaps the most interesting question here is why are C and C++ still in such common usage is it because there are no good alternatives well there is one alternative that I suspect most of us rely on so much that we don't even think about it and that's automatic memory management AKA garbage collection in languages like Java and C# when we need to create an instance of an invoice or a hashmap we just say new invoice or new hashmap and that's that we don't write any code to deallocate these objects when we're done with them the garbage

collector reclaims our discarded objects automatically um but in order to get this you got to give up the power of pointers in these languages you can't read or write outside your data structures or reinterpret their bits as some other type but since that's a good thing in most applications a whole class of problems just goes away but sadly garbage collection isn't all rainbows it has a substantial performance cost it basically has to Traverse through all the memory of your program find all the data structures you

allocated at some point and check whether you're still referring to them from somewhere this can add noticeable latency and pauses to your application it's simply not suitable in some context like writing operating system kernels or game engines but it is a very good trade-off for most end user applications and websites and even server applications at this point I can hear a chant starting in the room rust rust rust use rust and rust is pretty amazing like C and C++ it gives you full power

and control over memory management but unlike C and C++ it has a strict static checker called the borrow checker that rejects code that is not provably memory safe so Rust is a suitable language for operating system kernels and game engines and notably Linux is starting to adopt Rust it's the best of both worlds in terms of safety and performance but of course Rust has to come along with its own costs anyone who reads Hacker News or askprogramming can probably think of an article or two they've come across about some team that

tried Rust and found the experience to involve a certain degree of friction even if they did succeed and a great deal of time spent thinking about memory management and how to construct a program that both did what they wanted and also satisfied the borrow checker this friction can add up over time into higher development costs and lower productivity so basically we have something like this choose your memory management approach and get two out of three of these things note that I've put productivity in air quotes here because C and C++ are productive in the sense

that they allow you to ship your code quickly to prod just without any safety guarantees so is there a billion dollar mistake here well yeah the obvious one is that the full power of memory management is probably too powerful for most programmers to wield and we've repeatedly blown our legs off to the point that um have released guidelines urging that C and C++ be dropped entirely for new projects from 2026 I also want to add in this section that I think it's also a mistake for

most programmers on most projects to be thinking about memory management at all when garbage collection is perfectly adequate for most applications Rust is as safe as Java or C# but it trades off some productivity to squeeze out additional runtime efficiency Rust is likely to fill a lot of the gaps left as C and C++ become less used for critical software but for the kind of business applications and web servers that many of us work on that runtime efficiency those savings they're probably a lot smaller than the added

development costs of having to think about manually managing memory so in summary there was a problem I mean we've got data structures in our programs they have to ultimately map to memory so how do we manage that memory um pointers super fast super easy to implement solution but the consequences again here's that theme they drastically increase the space of invalid programs that are actually accepted by the compiler because they're

semantically valid um it can't detect the errors the alternatives well we've looked at a couple there's garbage collection uh and static analysis like the borrow checker they both provide safety but they have different trade-offs of their own finally what about our next mistakes I think we're going to keep making mistakes for a couple of reasons firstly computing is not a natural science with physics or biology we can observe nature and get more and more

accurate models over time because the physical world is the ultimate validator of our ideas computing is just pure ideas based on our own abstract concepts so we basically have no guide we just try things out and we keep trying we keep things that seem to work and we discover the downsides later also in contrast to natural sciences we rarely bother to back up our claims and opinions with empirical data I've got books that say a certain architecture or paradigm is better because it's simpler or it's more

maintainable but there are no footnotes pointing to any studies simpler and more maintainable compared to what based on what measurement there are a couple of areas where we seem to be in flux right now with competing claims about mistakes and in the interest of time I'm just going to mention these really quickly with a single slide each the first one is error handling and this is my summary meme slide so you get the vague idea of what I would talk about if there was time for this um what I find interesting is that the two Spider-Men here are quite

adamant that the other approach is a mistake but they often use the same reasons as the other it's usually some variation on well the other way makes it harder to understand your program's behavior or the other way makes it easier not to handle errors properly they can't both be right but at least they both agree on one thing and that's that Java checked exceptions were a terrible idea um OOP versus FP would have been another really fun thing to talk about is there a billion dollar mistake there somewhere um but I actually talked about

that here last year so if you're interested you can check out that talk on YouTube um but my prediction is that we're going to eventually get to a point where we're going to look back on purist ideas like Java's everything has to be a class as a mistake mostly just because it isn't the best tool for every job and um if you're forced to use nothing but hammers for everything it sometimes takes more time and effort to produce something and that thing you produce looks like a pile of hammered stuff the signs are already here I think

languages are combining OO and FP constructs Java and C# are adopting a lot of ideas from functional programming and new languages are finding new solutions too think Rust traits instead of inheritance or structural type systems like TypeScript and finally this is the last slide I just threw into this deck because I thought well there has to be one right yeah um in terms of this talk I think we can say well you know could

be a billion dollar mistake here yeah you know maybe in a few years we'll be going like the billion dollar mistake is the huge cohort of software developers in the world that we just don't need anymore or maybe the mistake will be letting the AI write the code you know leading to well who knows what but all I want to say is I don't think we have enough hindsight yet to say which is which but it's another one obviously that we're all watching very keenly so

conclusions well every section had one of these and you could see that there were a few things that were pretty common so I've just created a template one here you know the problem it's whatever today's problem is the solution well here's a really powerful way that solves all of that and it's like really flexible um and it's also easy to implement but then there's consequences with enough hindsight we notice that our solution allows this large class of incorrect programs to be accepted and all we do is we rely on human manual checks to detect when there's problems

and that doesn't usually go very well and eventually we invent better constructs that constrain the design space and provide better static checks um but then these alternatives usually have new trade-offs and new problems I think if you just want a sound bite Eric Sevareid put it nicely when he said the chief cause of problems is solutions also hindsight is 20/20 is probably apt here right like we see these things with hindsight but they're well intentioned in the beginning no one

sets out to make a mistake so saying blah was a billion dollar mistake is basically the same as saying in hindsight there was a cheaper or a better or a safer alternative to blah also software is complex most bugs are due to the ability to express bugs easily often without even trying the space of programs is huge just think of all the valid text you know text files that would represent some program or

another you know the space of syntactically valid programs is massive but a small subset of that will be the space of semantically valid programs and then a tiny subset of that is the space of not only semantically valid but also bug free because semantically valid includes all the buggy programs that you could possibly write which is a much bigger set than the ones that don't have bugs and even if you get it down to the correct subset you still don't know if anyone wants this you know is it actually fit for purpose is it maintainable is it

useful that little set is what you want to hit but it's a tiny target in this massive sea of other possible ways you could express a program so what we really need to do is shrink wrap the design space as closely as possible around the correct programs using the tools of our programming languages but also no closer because otherwise good programs become hard to express as Rico Mariani put it what we want to create is a pit of success so that when fallible humans do the obvious thing it also tends to be the correct

thing um I think the idea was also put pretty well in this paper Out of the Tar Pit it said power corrupts what we mean by this is that in the absence of language-enforced guarantees i.e. restrictions on the power of the language mistakes and abuses will happen this is the reason that garbage collection is good the power of manual memory management is removed and finally don't trust the programmer if there's one thing you learned today it

doesn't matter that the hole is easily avoided it just matters that many will fall into it regardless so try to make good easy and make bad hard thank you [Applause] thank you thank you to toy for such a great talk uh let's uh take it on the floor questions please raise your hand and then the mic will come to you
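
The stack race condition and the lock-based fix described in the concurrency section of the talk can be sketched in Python rather than the Java shown on the slides. This is only a minimal sketch under stated assumptions and not the speaker's code: the names `LockedStack`, `push_many` and `run_demo` are hypothetical, and a `threading.Lock` held around each operation stands in for the explicit locks or the `synchronized` keyword mentioned in the talk.

```python
import threading


class LockedStack:
    """Fixed-capacity stack; one lock guards every push and pop (hypothetical example)."""

    def __init__(self, capacity: int) -> None:
        self._items = [None] * capacity
        self._size = 0
        self._lock = threading.Lock()

    def push(self, item) -> None:
        # The lock makes the two steps below appear as one indivisible step.
        with self._lock:
            if self._size == len(self._items):
                raise OverflowError("stack is full")
            self._items[self._size] = item  # step 1: store the item
            self._size += 1                 # step 2: record the new size

    def pop(self):
        with self._lock:
            if self._size == 0:
                raise IndexError("stack is empty")
            self._size -= 1
            item = self._items[self._size]
            self._items[self._size] = None
            return item

    def __len__(self) -> int:
        with self._lock:
            return self._size


def push_many(stack: LockedStack, start: int, count: int) -> None:
    """Push the integers start, start + 1, ..., start + count - 1."""
    for value in range(start, start + count):
        stack.push(value)


def run_demo(threads: int = 4, per_thread: int = 1000) -> LockedStack:
    """Push distinct integers from several threads and return the shared stack."""
    stack = LockedStack(threads * per_thread)
    workers = [
        threading.Thread(target=push_many, args=(stack, i * per_thread, per_thread))
        for i in range(threads)
    ]
    for worker in workers:
        worker.start()
    for worker in workers:
        worker.join()
    return stack
```

Without the lock a thread could be suspended between step 1 and step 2 of `push`, which is the inconsistent state the talk points at with the orange arrows; with it, every pushed value should come back out exactly once, at the price of threads blocking each other.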

what do you think about the role of um billion dollar mistakes being caused less by a solution that is inherently flawed and more by a solution that's of its time that becomes harmful when that context is removed like for instance not having garbage collection was more justifiable back you

know in the 60s and 70s when they simply didn't have the compute resources to spare but it became a billion dollar mistake as we gained those compute resources and still didn't adopt it for these languages yeah I think it's a great point um when garbage collection was first out there it would have been a big trade-off to think about right resources are very constrained another good example might be um if you're in the business of making compilers one thing that was

really important is that they stream your programs they never basically see the entire context of the source code and they never have the context of the entire thing that they produce on the other end because there's just not enough memory to have all that but you probably don't need to worry about that now right you could basically have all the text files all fed in at once all in memory and maybe even all the outputs all in memory at once as well so there are constraints that change over time if that's what you're referring to yeah so I think we just have to sort of keep up with the times and notice which constraints don't apply anymore and

which things have now become just so cheap that we shouldn't bother with the um maybe more efficient but more error-prone alternatives um hello so my question is you mentioned C and C++ might be replaced by Rust because of the safety over productivity reason and do you think that's going to apply to like OS

kernels as well or just like new critical software um to be honest it's been a long time since I wrote any C++ um but it was true what I said in the talk that there's been an advisory recently that you know um the US government is going to start basically rejecting new projects for government contracts that are written in C and C++ so I think there's going to be a lot of pressure to move away from it um I honestly I couldn't answer the more detailed

question of how things will go in other you know like non-critical fields I mean you can be very productive in C and C++ and there's enormous code bases out there I mean there's still COBOL code bases running out there right so um I think they'll be around for a long time to come thanks Pro I really enjoyed your talk um thank you thanks uh I was just hoping you could talk more to the functional programming and where you see some of the billion dollar mistakes

uh in that space yeah um yeah it was something I thought oh I would love to talk about that but I'd probably just do a whole talk on it um to be honest like the nutshell version of it is that I think both object-oriented and functional programming would potentially have their own version of the billion dollar mistake which is trying to be too pure like if you say everything is a class now you got to model everything as a class and it's not a great way of modeling everything um but functional programming has its own version in

fact the very foundation of functional programming is the lambda calculus and the lambda calculus is a calculus that says everything is a function including the values true and false they're not values they're functions that return the first argument or the second argument so and everything is built up in lambda calculus from functions within functions if you tried to actually write a you know line of business application that way I mean it would just take you forever and in the same way I think we probably over complicate some of our

business applications by trying to figure out how to model everything with classes so um I think probably hybrid approaches are just the more cheap version because we're not being religious or ideological about our code we're just doing the you know the thing that works easiest and yeah that would be my summary hello oh hi love your talk love your talk last year as well um got a question regarding that um Venn diagram slide it made me giggle because you had

productivity in quotes um that one it's a really good slide um so I wanted to pose the question to you at what point would you draw the line between like restriction in development velocity versus mistake prevention because personally I'm happy to go a little bit slower if that means I'm not blowing my leg off every other second so where do you draw that line so you're talking about if you have a choice between say for

example Rust and C++ right you're not talking about a choice between Rust and say C abstract just like gen okay so um you know Java memory safe right y but there's a lot of cruft in the standard library there's a lot of cruft in just external user libraries building it pain in the neck right so where do you draw that line personally um I mean I've fallen down

these holes a lot myself which is probably why I ended up you know talking about you know being interested in talking about this like I've gone down productivity holes because I'm using languages and C++ was actually one that got me for the longest I figured out you could do almost everything in three different ways in C++ you could do it kind of the standard normal way you could do the whole thing in template meta programming and have it all run at compile time um and you could do a whole bunch of stuff in macros as well um and what I realized after a long period of time is that I spend so much time not

working on the problem that I was originally trying to solve it's the XY problem like I need to solve x but in order to do that I need to solve y and I would just get lost in this sea of y z you know other things forgetting all about x so I would just say that to me if controlling the memory was really critical to the application I'm working on then I'd consider Rust but if I know that the application I'm working on which is like an app on a smart if it like even if it's like half as efficient if I

don't have to think about memory management at all that's probably what I would choose but you know if memory is really important then I think Rust is probably the way to go now over C or C++ thank you to I think we don't have any more questions let's give a big applause for toy again uh if you want to take this offline I appreciate you taking a picture of his uh credential on the last slide there

otherwise uh if you haven't had a chance to get a souvenir the year book is still available at the info desk for $25 go quickly before you miss out thank you for coming and enjoy the after party if you're staying otherwise I'll see you in a different season thank you
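
The lambda calculus point from the Q&A, that true and false are not values but functions returning their first or second argument, can be sketched in Python. This is an illustrative sketch only: the names `TRUE`, `FALSE`, `NOT`, `AND`, `OR` and `to_bool` are chosen here for the example, the functions are written in curried form (one argument at a time), and `to_bool` steps outside the pure calculus just so the results can be inspected.

```python
# Church booleans: a boolean is a function that picks one of two arguments.
TRUE = lambda first: lambda second: first     # returns its first argument
FALSE = lambda first: lambda second: second   # returns its second argument

# Logic built from nothing but function application.
NOT = lambda p: p(FALSE)(TRUE)        # TRUE picks FALSE, FALSE picks TRUE
AND = lambda p: lambda q: p(q)(p)     # if p is TRUE the answer is q, else p (FALSE)
OR = lambda p: lambda q: p(p)(q)      # if p is TRUE the answer is p, else q


def to_bool(church) -> bool:
    """Convert a Church boolean to a Python bool for inspection only."""
    return church(True)(False)


def if_then_else(condition, then_value, else_value):
    """Branching is just applying the boolean to the two alternatives."""
    return condition(then_value)(else_value)
```

Even this much shows why the speaker calls the pure approach impractical for line of business code: something as small as a conditional has to be assembled from nested functions.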
