Creating a Terraform Provider for Just About Anything
How's it going, everyone? Like she said, my name is Eddie Zaneski, and I serve the developer community at a company called DigitalOcean. Quick show of hands: who here has used a DigitalOcean tutorial before? That's a lot of people. Despite what many people think, we are not a tutorial company; we are a cloud provider, so we've got all those things that you need to do the cool stuff. Cool. All right, so today we're here to talk about Terraform. I'm also joined by a few awesome colleagues from DO, especially Tom right over there. Wave, Tom. We have a huge Vault deployment internally, so if
you ever want to talk about rolling your own Vault deployment, come see Tom at our booth over there. Cool. So we are here to talk about Terraform. Can I get a gauge of the audience: who here has used Terraform? Hands up. OK, that's most of everyone. Who here has contributed to or written a Terraform provider? OK, very cool. So we don't need to talk too much about what Terraform is. We do need to talk about
how Elon Musk wants to terraform Mars. All right, so Terraform. I like to describe Terraform as a giant state machine. We can talk about it as infrastructure as code, we can talk about a lot of things, but at the end of the day Terraform is a great state machine. That's that little terraform.tfstate file that you see. And the nice thing is that you can have any resource that is backed by some form of an API. It can be a JSON API, gRPC, an XML API; any type of
resource that is backed by an API can be turned into a Terraform provider. What Terraform does at its heart is really just marshal resources between a JSON payload and an internal Terraform struct called ResourceData, which we'll talk about. So as awesome as this tool is, its roots are very simple, and it's definitely not that complicated to pick up. So Terraform is the state machine; what is a Terraform provider? Like I said, any resource that
is backed by an API can be turned into a Terraform provider. Some examples: there are like 90 Terraform providers that are supported. It's a lot on there. Another colleague, Andrew, and I work on our DigitalOcean Terraform provider, so andrew@digitalocean.com for complaints. And here's a quick example of our Terraform provider: you have a provider, then you have this attribute in there for a token, and then a resource. And so you've
all seen HCL before. So you can manage resources; you can manage your servers, which is really cool. You can also manage things like Kubernetes. With a Terraform provider for Kubernetes, instead of writing all that giant Kubernetes YAML, you can now define your Kubernetes services in HCL, which is a lot better: there's not a ton of repetition, there's not a ton of nesting. We haven't even done this yet. Has anyone used HCL with Kubernetes? Have you liked the experience? That's a
thumbs-up. And not only can we do things that are infrastructure and service related: there's the GitHub provider that everyone's probably seen. The cool thing about this is that it's an API, and we have a provider for it, so we can manage this resource via Terraform state. Can you imagine: on your first day of work you come in and they're like, OK, engineering onboarding, all you have to do is make a pull request to this single repo to get access to everything you
want. You can define your users in here; you can define your repos. Instead of having to hunt down that one person on your DevOps team who can create GitHub repos, you can just make a PR to your GitHub repos repo, and it's just a giant Terraform file that you can add to. You can pull in all the users that are already in there as data sources and grant access that way, create teams that way. And so you can start to see where we're going with this,
right? So we could do something like what the marvelous Seth Vargo has done, where he created a Terraform provider for Google Calendar. What if we took all of our Google Calendar events and represented them in some set of Terraform state? Now we've got an actual representation. He wrote an awesome blog post, linked at the bottom there, that you all should check out, and it worked really well. So I was like, OK, what other things can we play with? What other types of providers can we make? So who here has heard of the
Philips Hue light bulbs before? A couple. So I wrote a Terraform provider for the Philips Hue light bulbs. We use the attributes at the top to find the bridge that it's going to talk to, and then we pull in the data resource. Who here has used data resources? Do I need to explain them? So, a resource is something that is being created and is mutable; you're messing around with it. A data resource is something that you don't want to modify but want to pull in as a read-only type of thing. So there's an
import command where you can import a resource, but then when you run terraform destroy, that resource gets destroyed. A data resource is just a permanent read-only thing. So I pull in the light ID of my kitchen, and then I set the color to some kind of awesome pink fuchsia, and boom, now we're managing an actual resource in the real world with Terraform. And what if we also did something like managing our to-do list? So I created another Terraform provider; this one's on
GitHub, at the bottom there. I use a to-do list client called Todoist. It's got an API behind it, and, you know, my wife right now is packing so we can move to Denver on Saturday. I'm really sorry, Kaylee. But so, what if we implement these things so we can see them in the real world? I have a quick demo of this Terraform provider that I'll show you. We'll start with that main.tf file and we'll pull in the provider, and then you pass an API key
there, but it's configured through an environment variable so you can't hack me. So we've got a resource: it's a todoist_task, and I'll call this one talk. All right, and we can have some content: give talk. Boom. All right, so we save that. terraform init, terraform apply. OK, it wants to create me a resource. Boom. If I pop over to my to-do list, we can see that, look, it's created
a resource. And we can go in there and say, OK, give talk right now, and update it, and it will update the content. So you get where this is going. We can do things like mark it as completed, you know, completed equals true, and we save that, and it's going to disappear because it was marked as completed. Cool.
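The create, update, and complete steps in the demo are the "giant state machine" idea in action: Terraform compares what the config asks for with what it has recorded in state and picks an action. Here is a minimal Go sketch of that comparison. It is a toy model, not Terraform's actual planner; the Task type and the Plan function are hypothetical names made up for this illustration.

```go
package main

import "fmt"

// Task is a hypothetical stand-in for one to-do item, used both for what the
// config asks for (desired) and for what is recorded in state (current).
type Task struct {
	Content   string
	Completed bool
}

// Plan compares the desired config with the recorded state and names the
// action a Terraform-like tool would take. A nil pointer means "absent".
func Plan(desired, current *Task) string {
	switch {
	case desired == nil && current == nil:
		return "no-op"
	case current == nil:
		return "create"
	case desired == nil:
		return "delete"
	case *desired != *current:
		return "update"
	default:
		return "no-op"
	}
}

func main() {
	state := &Task{Content: "give talk"}

	fmt.Println(Plan(&Task{Content: "give talk"}, nil))             // create
	fmt.Println(Plan(&Task{Content: "give talk right now"}, state)) // update
	fmt.Println(Plan(&Task{Content: "give talk"}, state))           // no-op
	fmt.Println(Plan(nil, state))                                   // delete
}
```

A real provider never writes this comparison itself; Terraform core does the diffing and then calls the provider's create, read, update, or delete function for the resource.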
And so let's bring it back, and then we can do something with the data resource I was talking about. So we'll have a data todoist_project (it's like categories, these projects), and I'll call this my talks project, and the name of it will be Talks. I'm going to show you how this is working, not just show it off. And then we have a
project ID, so we'll do some interpolation: data.todoist_project.talks.id. OK, save that. So if I run this again... it's mad about something, of course. Oh, it's because the task is already created. So, terraform destroy. The API is a little wonky: you can't change a project
once it's created. Cool, so, created. Boom. And if we look down here in this Talks project, it's popped up there. So we took a resource that already existed, which was our project, and pulled it in as a data resource so we could reference it in another resource. That's the power of Terraform. Cool. So let's actually dive into how that works. But first: why, though? Why would we want to create these Terraform providers? Can I
get a show of hands: who here has made a Terraform provider, again? OK, that's a few of you. How many of you have made one for internal stuff? OK, just about the same number of people, a few fewer. So some of the reasons you might want to build a provider: it's for your company. Maybe you're a cloud provider and you want to have a Terraform provider for your customers to use. Maybe your customers have asked for one so they can manage your resources inside of Terraform. Or you have a resource and an API that make sense to manage via Terraform. Some nice things to
do with this: you can skip a front end entirely. If you're working with HCL and Terraform configuration files, you don't need someone to write that React magic front end. You also don't need to create little duct-tape apps that implement the API that you have. Everyone's had to do a tiny little one-off app that they run and it does all the magic. With Terraform you don't actually need to do that, because Terraform is acting as your front end and API client. And so the next reason is internal, and so
building internal providers for Terraform is where I really see the power here. We have a Terraform provider called terraform-provider-do-internal. This is a very shallow shim, slash fork, of our public one, and it adds a few additional fields to it. Terraform providers are written in Go. They're simple Go plugins: they're actually binaries that get compiled, and they spin up a little RPC server that talks to the main Terraform process. This is really cool under the
hood. I have some resources at the end so you can read about them. But the way this actually works is we have an internal repo that pulls in the public repo, and with Go you can just add a file with the same package name and compile it, and everything magically gets compiled together. You don't need to do some crazy overriding; Go will just compile everything with the same package name. And so we have a few additional fields that are represented, like hypervisor placement; this can influence where our VMs get placed
physically throughout the server racks. Or whether you want to register into the Chef server; this is for our internal management droplets. We can also expose some beta features to internal people, because all our teams in DO actually use Terraform to manage some of their infrastructure. I was talking with someone I met last night who works at Target, and they have some big beefy F5 load balancers. If anyone's worked with F5 load balancers before, you know that you define your rules via ASIC firmware: you have to go in, write a bunch of ASIC firmware, compile that, and
you flash your load balancer so it has your rules. Well, these guys have built an API that handles all of that, so they can go in and make API requests, and that API will compile the firmware and flash it to their F5 load balancers. So now he's interested in building a Terraform provider, because then you can add your rules through HCL and you don't have to have whatever front end you were using before; you don't have to have your curl requests or, you know, Postman. And so hopefully some of the wheels are starting to turn and you're thinking of some
internal APIs that you can build a provider for. Who here... yes, a couple of people have contributed to a provider. You might want to contribute so you can fix a bug or add a feature that's missing. A quick plug for Hacktoberfest: it's a thing that we're running right now with GitHub and Twilio. If you make five pull requests in the month of October, we'll send you a sweet T-shirt, so check it out: hacktoberfest.digitalocean.com. You might also want to maintain a fork of a provider that exists. We actually maintain a fork of the Cloudflare provider because there are
some enterprise features that weren't implemented yet, and we have a legacy account, so we have some different, older policies that aren't supported via the new API. So if there's a provider that you need to make some changes to, you can fork it and just manage it yourself. The important thing to understand when it comes to working with Terraform is that everything is convention first. I learned Go through working on Terraform, and lots of people have told me that was a very big mistake, because originally
Terraform was written by some Ruby developers, and, you know, they're definitely great at Go now. Fun fact: the original Terraform code was written inside of the DigitalOcean office by Mitchell, because he was in New York for the summer and it was really hot and he wanted to come to DigitalOcean's air-conditioned office. So everything was done through convention, and this is something you have to accept. If you're a hardcore Go developer, you're going to see some of the functions, and you're going to see some underscores in there with some snake
case, and you're going to cringe a little. You just have to embrace that convention comes first. So the question is where to start. Where you need to start is with a Go API client. For that Todoist provider I showed you, I had to build a Go API client. You should look at it; it's definitely not well written, because I don't know Go that well. But this is where you start. You need to have a strong API client, because the important thing is to separate your API's logic from your Terraform provider's logic. If there are any types of
weirdness in your actual API, you want to address that in the API client. You don't want to have to put shims and duct tape in your provider, so you want to abstract these things as much as possible. Your client needs to have superb error handling, because we're going to talk about what to do when things break. And then logging: you should have some kind of logging mechanism so that when something is broken you can quickly identify whether it's related to Terraform, your API, or your client. So start with a strong API client. And so there are skeletons
available for starting a provider. This guide down here is the most important resource to read, so check that out for getting started; it's got a nice walkthrough where you build a hello-world provider. The skeleton is simply a function that returns a terraform.ResourceProvider. You return what we call a schema.Provider, which we'll get into in a second. But the skeleton is very basic, and there's a main file that just registers
your plugin, so it's not even worth getting into. What you need to do to get started is actually read other providers. This is where you're going to find the most help, because outside of that guide there aren't that many resources out there. So look at the DigitalOcean provider, look at the Amazon provider; the Google provider is very well done. All the top cloud providers are extremely well done. So look at those providers, dig into the different types of resources, and then look through the schemas. The schema is the most important thing to understand here, and this comes
after that line of the func declaration: you return a schema.Provider. If I can make this any bigger... everything in here is done through maps. You have these maps of a property name to an actual schema declaration. In that Todoist provider, remember I said you could declare the API key if it wasn't set through an environment variable? This is simply what that looks like. You have this tiny little schema, and Terraform
immediately knows how to take in that variable and assign it where it needs to go. It definitely works magically under the hood. The important thing to note is that all the type enforcement for Terraform is done in the schema. So while you're developing your plugin, your Go functions, and your CRUD functions, you need to trust (and this is where the Go developers are going to get a little uneasy) that the data coming in is what you've declared it to be in the schema. You're a little hesitant at first, but at the
schema level, Terraform won't slurp up that HCL file successfully if it doesn't match the schema. So you have to trust that everything coming in is mapped to the schema correctly, or else you're going to have a lot of garbage code. The types that are available are booleans, integers, floats, strings, lists, maps, and sets. Sets are where it gets really cool. You've seen in HCL that you can have nested resources: maybe the attendee on the Google Calendar invite, which had attendee, then email, then my email address.
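To show why it is safe to trust the schema, here is a minimal Go sketch of the idea: the config is checked against the declared types once, up front, so the code that runs afterwards can type-assert without re-checking. This is a toy model and not the Terraform SDK; Field, ValueType, Validate, and taskSchema are hypothetical names made up for this illustration, and it only covers three scalar types.

```go
package main

import "fmt"

// ValueType is a toy version of the type tags a schema declares.
type ValueType int

const (
	TypeBool ValueType = iota
	TypeInt
	TypeString
)

// Field is a hypothetical, stripped-down schema entry.
type Field struct {
	Type     ValueType
	Required bool
	Default  interface{}
}

// taskSchema mirrors the task resource from the talk: required string
// content, optional completed flag that defaults to false.
func taskSchema() map[string]Field {
	return map[string]Field{
		"content":   {Type: TypeString, Required: true},
		"completed": {Type: TypeBool, Default: false},
	}
}

// matches reports whether v has the Go type that the schema type calls for.
func matches(t ValueType, v interface{}) bool {
	switch t {
	case TypeBool:
		_, ok := v.(bool)
		return ok
	case TypeInt:
		_, ok := v.(int)
		return ok
	case TypeString:
		_, ok := v.(string)
		return ok
	}
	return false
}

// Validate rejects unknown attributes, missing required attributes, and
// wrongly typed values, and fills in defaults for optional attributes.
func Validate(schema map[string]Field, raw map[string]interface{}) (map[string]interface{}, error) {
	for name := range raw {
		if _, ok := schema[name]; !ok {
			return nil, fmt.Errorf("unknown attribute %q", name)
		}
	}
	out := make(map[string]interface{}, len(schema))
	for name, f := range schema {
		v, ok := raw[name]
		if !ok {
			if f.Required {
				return nil, fmt.Errorf("missing required attribute %q", name)
			}
			if f.Default != nil {
				out[name] = f.Default
			}
			continue
		}
		if !matches(f.Type, v) {
			return nil, fmt.Errorf("attribute %q has the wrong type", name)
		}
		out[name] = v
	}
	return out, nil
}

func main() {
	cfg, err := Validate(taskSchema(), map[string]interface{}{"content": "give talk"})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	// Safe to assert without checking: Validate already enforced the types.
	fmt.Println(cfg["content"].(string), cfg["completed"].(bool))

	// A wrongly typed value never reaches the code that uses the config.
	_, err = Validate(taskSchema(), map[string]interface{}{"content": 42})
	fmt.Println(err)
}
```

The design point is the same as in the talk: validation lives in one place, so the CRUD code stays free of defensive type checks.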
The attendee is a TypeSet, and below it you can have a whole other set of schemas. This is how you can really be expressive with HCL in Terraform. There is something jokingly called the Terraform standard library: this is everything that lives under the terraform/helper tree. Let me pull it up real quick. There are a ton of packages in here that will make your life better. You're going to work with
these nonstop. Some of the ones to call out: everything in the schema helper. Under the schema folder there's a lot in here. You've got to read through the code; believe me, it will very much help. The code is all very well documented, too. In the schema package, a couple of notable objects to read are the Provider, the Resource, and the ResourceData, and so
the ResourceData is what gets brought into your functions, which we'll talk about in a second. There are validators that I didn't know existed until way down the road: you can put in a validator that's like, is the integer between these two values? So there are a lot of these helper functions that are already written. If you're like, oh man, I really need a convenience method right here, there's a chance it exists, so check the helper folder. And the testing library, which we'll talk about later, has a bunch of random helpers, like random string and random
SSH key pair and stuff, so read through the source there. The next callout here is meta. I'm going to show you what these CRUD functions look like. Inside Terraform you're going to see something called meta, which can be very confusing coming in, and this is where again the Go developers are going to cringe, because
Terraform, as well done as it is as an SDK and a library, has to work with every type of API client out there that you're going to use, and so it uses a lot of empty interfaces, where you have to type-assert things. The purpose of this configure function is to return what they call the meta. In this case this is my Todoist REST API client, and that gets assigned here. This fires when you first launch your provider. So you're
going to see meta a lot. Just remember: meta is your client, which you define in your configure function. The very basic, bare-bones unit is the resource. You can see here it's just a map that you have in the top-level schema, and you map the resource name to a function that returns a resource schema. You'll see this pattern a lot. The same thing goes for the data sources and your importers; they all have
a data sources map or an importers map, and it's a very common practice here. We've got a few slides with a bunch of code on them, so bear with me. This is for that todoist_task resource; this is the actual schema that represents that Todoist task item. It returns a schema, and it's got that content field in there that I showed you, and it's got a type that's declared, and it's required: it needs to be there, you need to have content for a to-do. All these helpers, all of this functionality
is built right into the SDK that you're working with. Same thing with completed: it's optional, and it defaults to false if it's not there. Then below that we have the bread and butter, which is the CRUD functions: create, read, update, delete. Again, all this is doing is marshaling resources into Terraform state objects. So we have a CRUD representation, and Terraform knows how to manage your entire API. This is what it looks like for that create: it
takes in a ResourceData object. Callback to earlier: this is a very important one to read. This is basically just a giant hash map that has a bunch of helper methods put on it. So you see here we do that type assertion: we grab the meta that gets passed into this create function and cast it to a client, and then we take that resource data and grab out the content, which, again, we're not type checking, because that's done at the schema
level. So we type-assert that it's a string, and then we can go on and create our task and check the error. The nice thing is that all these CRUD functions return an error. We'll talk in a little bit about what happens when errors are returned, but you don't actually need to do anything complex. If something's wrong you simply return an error, and Terraform knows how to process that error based on the lifecycle method. The last callout is at the bottom here: that is the read function, which we're going to look at
next. What you want to do with these is make them as composable and reusable as possible. The read function will take in a meta and your resource data, and again it'll return an error, so it's safe to do. Instead of writing a bunch of very similar functions and code, you want to reuse these as much as possible. See, we're just passing in the same d that we get above, so we can reuse the read functionality. The read is very simple as well: we grab the client, we grab the
ID out of the resource data, we make an API call to get that task, and then this is the marshaling: we take that resource data and set the content to the task's content. And this is it; this is how Terraform works. Sometimes you have a bit more logic in here, but you are literally just marshaling resources from some kind of API response into this Terraform ResourceData object. So it's real easy. Updating has a bunch of
convenience methods. We have the HasChange method, which checks the resource and tells you, oh, content has changed, so you can go through and update the content. There's a lot of boilerplate; you can kind of meta and abstract this a little bit, but you're really just working with these CRUD functions. Same thing with delete: you take in the resource, you grab out the ID, you delete it, and if it's successful you return no error. Cool. So, error handling.
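Before moving on, the create and read functions just described can be sketched in plain Go. This is a toy model under stated assumptions, not the Terraform SDK and not the real Todoist client: Client, ResourceData, createTask, and readTask are hypothetical stand-ins made up for this illustration. It shows the three moves from the talk: assert meta back to the client, set the ID as a string, and reuse read at the end of create.

```go
package main

import (
	"errors"
	"fmt"
	"strconv"
)

// Client is a hypothetical in-memory stand-in for a Go API client.
type Client struct {
	nextID int
	tasks  map[int]string
}

func NewClient() *Client {
	return &Client{nextID: 1, tasks: map[int]string{}}
}

// CreateTask stores the content and returns the new integer ID.
func (c *Client) CreateTask(content string) int {
	id := c.nextID
	c.nextID++
	c.tasks[id] = content
	return id
}

// GetTask looks a task up by ID.
func (c *Client) GetTask(id int) (string, error) {
	content, ok := c.tasks[id]
	if !ok {
		return "", errors.New("task not found")
	}
	return content, nil
}

// ResourceData is a toy version of the bag of state that each CRUD function
// receives: a string ID plus a map of attributes.
type ResourceData struct {
	ID    string
	Attrs map[string]interface{}
}

// createTask asserts meta back to the client, creates the task, records the
// ID as a string, and then reuses readTask to fill in the rest of the state.
func createTask(d *ResourceData, meta interface{}) error {
	client := meta.(*Client)
	// No type check here: in this sketch the schema is assumed to have
	// already enforced that content is a string.
	content := d.Attrs["content"].(string)
	d.ID = strconv.Itoa(client.CreateTask(content))
	return readTask(d, meta)
}

// readTask marshals the API's view of the task into the resource data.
func readTask(d *ResourceData, meta interface{}) error {
	client := meta.(*Client)
	id, err := strconv.Atoi(d.ID)
	if err != nil {
		return fmt.Errorf("invalid task ID %q: %w", d.ID, err)
	}
	content, err := client.GetTask(id)
	if err != nil {
		return err
	}
	d.Attrs["content"] = content
	return nil
}

func main() {
	// meta is whatever the configure step returned; here, the client.
	var meta interface{} = NewClient()

	d := &ResourceData{Attrs: map[string]interface{}{"content": "give talk"}}
	if err := createTask(d, meta); err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(d.ID, d.Attrs["content"])
}
```

Returning the result of readTask from createTask is the reuse the talk recommends: there is one place that turns an API response into state, and every other function funnels through it.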
Extremely important. And again, going back to the API client, you need to make sure that your API logic is separated from your Terraform provider logic. If there are band-aids you need to put in place, fix your API, fix your client; try not to have this in your actual provider. For your provider to be successful and easy to use, you need to have extremely robust error handling, and it needs to be very fault tolerant. You need to handle that random Cloudflare 5xx that you get
back, which is an HTML page for some reason when you expected a JSON response. You need to have logic to handle that. You don't have to handle that specific case, but you need to have some error method that knows how this gets handled. You want to make sure you log all the things. There's great functionality built in for logging: you can just set TF_LOG equals INFO, DEBUG, you know, different log levels, and it will print out a bunch of stuff. So you're going to use this a lot as
you're building your providers. Coming back: you need to quickly identify what's your API's fault versus what's Terraform's fault versus what's random. If we look at this little stack trace here (this is from the Terraform docs), you're going to see this type of thing a lot, which is a giant panic stack trace when something goes wrong. Luckily, they tell you the key part of this message is the first two lines. So just note here: you want to identify what is your responsibility as quickly as you can,
just like reading any stack trace, and they jump right into the actual resource file and the line. You'll see these stack traces a lot; quickly identify what's your fault. Partial state is another thing that you'll have to deal with. This is where maybe your resource was created or updated, but only part of it was; maybe the error happened after part of the resource got updated. And just a
quick callout from the docs here: if the Create callback returns with or without an error and an ID has been set, the resource is assumed created and all state is saved with it. That is very important, so read through this page at the bottom here; they have a very long description of how these errors get handled. And just to make sure we understand: if you look at the very bottom there, that
SetId function: as long as the ID is set, no matter what other errors are happening, it's going to assume that your resource was created successfully. So there are a lot of caveats to learn; read the docs at the bottom there, and build extremely well-done error handling. Testing: this is definitely the biggest pain point that I've run into while working on these providers. And it's not that it's not well done; it's just, if you
think about it, how do you provide a testing framework to test against a real API that's creating real resources in real time? This is where we're going to spend the most time: understanding these test frameworks. The test framework, like I said, is very well done; it's just that there's a lot here. So walk through it with me real quick. This resource.Test function is how we declare a test. It takes in a resource.TestCase, and it has a PreCheck function.
This is where you set up your API client and do a bunch of other stuff. You declare your providers; these are all usually defined in one shared function and one test config file. Then you declare a CheckDestroy function. Terraform will basically run terraform apply on your test case, it'll run through all of your test steps to make sure things are done, and then after all your steps it will make sure that all of your resources are
destroyed. So you provide a function to assert that your resources are actually destroyed. And the resource test steps here, this is the bread and butter of the testing. Every test step has a Terraform config, and you can see here we want reusability. This is an acceptance test; that's the testAcc prefix. So, testAccCheckTodoistTaskConfig_basic. You'll have a lot of different configs. They look like
this: you can define a function that takes in the templated thing and then uses a format string to print it out. But you're literally just testing a Terraform config in string form. You don't have to declare your provider; that's all done at the start. So you take a basic function like this, you plug it into the config for the test step, and then there are a bunch of helper methods that can do something like resource.TestCheckResourceAttr, and it knows how to check that
the todoist_task test's content matches whatever content you expect it to. So you're literally spinning up resources, asserting that they got defined the way they should be, and then moving on to the next step. You can have many steps for a single test, and you use these steps to simulate things like updates and modifications. HashiCorp has a TeamCity server, which is JetBrains' CI/CD server, and so all of their
providers are continuously tested. One of the requirements to be an official provider is that you have to provide them with an actual API key that they can use to test the provider. So this is constantly spinning up resources and asserting that everything's working, and that's how they're assuring the quality of these providers. So, the test flow: it starts with that pre-check, then it goes through the test steps, then destroy, then check destroy. You take one config per step and use new steps to
simulate updates and deletes. Reuse as much as possible. While you're writing these early on, it's very easy to end up with a bunch of spaghetti helpers and a ton of repeated code, so just approach this with the mindset of: I need to be reusable, I need to be composable. Abstract as much as you can into these tiny functions that are easier to work with. The Makefile: you definitely want to copy and paste a Makefile from
the skeleton or other providers. Fun fact: we didn't realize that you could run a single acceptance test. We thought you had to run the entire test suite, for way too long. You can in fact run a single test, I'm sorry, a single resource test. You do make testacc and pass in TESTARGS with the name of the test. It saves you a lot of time, because these test suites can take, I think ours takes 79 minutes or something to run. They run in parallel, but a lot of the steps are sequential
because you're modifying resources, so the test suite takes a long time. Take advantage of the fact that you can run one test at a time. Docs: it's very easy to get started with these. There's a magic website folder in all the providers, and we'll look at this real quick. So we've got the magic website folder here. This is like the index page, so you just have a very simple navbar with links to the rest of the resources, and
then the individual resources are under the docs folder, and you have the d folder for data sources and the r folder for resources. These are just simple Markdown files that you've seen before and are used to working with; it's just a very simple Markdown file with ERB that gets parsed out. The TeamCity job runs and basically takes all of the magic website folders from all the providers, compiles them, and deploys them to the Terraform docs website. So they have a really cool
CI/CD job they've set up; you don't really have to think about it. If you want to get started contributing to a provider, this is a great place to start, so dive into the docs. Process notes; I think I should have mentioned this earlier. If you are working on what you want to be an accepted (you know, not internal, but community) type of provider, engage HashiCorp early. They'll give you the resources; they have a program they'll get you involved in, and they have a Slack for people who
are working on providers, so you can get access to other folks who are also building providers. So talk to HashiCorp early, and use Travis CI for GitHub testing. Releases are done via Slack, which is pretty cool: you just pop into that Slack channel and you're like, hey, can someone release version 1.0 of the DigitalOcean Terraform provider? And then a nice HashiCorp employee comes along and they're like, yeah, I got you. So there's a lot of magic that happens. They did a really good job structuring this program to support
the ninety-plus providers they have. The process is at the bottom; if you actually want to get an accepted, sanctioned provider, go through that and read it. It's very well done. A couple of tips. Learn a bit of Go first. Again, I came at this as a JavaScript and Ruby developer, so learn Go. You're going to get very familiar with the strconv package. This is for going from ints to strings
and strings to ints. You're going to use it a ton, especially because all the IDs in Terraform need to be strings. Even if your resource has an integer as an ID, it needs to be a string inside Terraform, so you're going to do a ton of converting back and forth, and Go will get mad at you if you don't do it right. To get around this you can use custom UnmarshalJSON functions; you can Google it. Basically you can unmarshal into the struct like you're used to, but you
declare your own UnmarshalJSON function, and you can have a temporary struct in there where you convert the ID into a string, and then, you know, it's magic. So use those; they'll save you a lot of converting on the fly. Have a solid Go API client. Fix design problems with your API if possible. Understand Terraform modules. We get a bunch of GitHub issues where there are a few caveats with modules, where, even though you're taking a list, you still always need to
wrap your list when you're passing it through a module. So just understand how Terraform modules work, because your community and your users are going to take your provider, use it in a module, and then raise issues and bugs about it. There's also HCL 2.0, which has a null type, which is really great: you can actually set a value to null instead of the crazy Go default that's like zero or empty string. Don't be afraid to copy and paste from other providers and from your own provider. Basically, to add a new resource right now, I just copy from our droplet,
which is our server resource, do a find and replace everywhere to change the name, and then worry about the logic. So copy and paste a ton. Focus on composable and reusable functions. And then there's something called sweepers; this is the thing that runs afterwards to make sure all the resources are deleted so you don't have stragglers hanging around. Some resources: look at other providers; there are the Terraform docs, guidelines, and source code. Again, read the source code. A couple of videos: these are two
really good videos where they actually walk through building a Terraform provider hands-on, live. One of them is a little out of date, but the concepts are still there; the API changed a little bit. But watch these videos, they're really good. And then if you're looking for someone to actually work on a provider, a consulting company, or you just want to pay someone to get it done, OpenCredo is someone that we worked with for a little bit; they did some consulting and education for us. So OpenCredo: we highly recommend them as a dev shop to work on providers. And with
that I'm out of time, right on the dot. You can find me on the internets at @eddiezane: email, Gmail, DigitalOcean email, GitHub, Twitter. The link for the slides is at the bottom here if you want to grab the slides. Other than that, thank you all so much for listening, and hopefully this was helpful.