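Terraform is a tool for automating and orchestrating IT infrastructure. Its slogan is "Write, Plan, and create Infrastructure as Code". Terraform supports nearly every cloud service on the market.
Terraform mainly addresses the allocation and management of hardware resources in the cloud. Compared with software configuration tools such as Chef, Puppet, and Ansible, Terraform provides the ability to build the underlying (software and hardware) infrastructure resources that must exist before software configuration begins.
As an infrastructure-as-code tool, Terraform addresses the allocation and management of cloud hardware resources and supports multi-cloud environments. Compared with configuration tools such as Ansible, its advantages are concurrent resource creation, a State file that records resource state, and version control. This article covers installation, configuration file structure (Provider/Resource), resource lifecycle commands (init/plan/apply/destroy), building resource dependencies, running scripts with Provisioners, managing input and output variables, querying with Data Sources, and organizing code with Modules. It also describes how to combine Terraform with Ansible, either by calling Ansible from a Provisioner or by generating an Inventory file, to automate the full flow from resource creation to environment configuration.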
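When it comes to creating resources, both kinds of tool can do the job, but Terraform has the following advantages:
After resources are created successfully, Terraform records their full details locally in a state file. This record keeps the resources defined in the template consistent with what actually exists in the cloud, and it makes it practical to put the whole set of resources under version control. Terraform also has a Data Source feature for reading existing resources, for example listing the available zones or available images before creating anything.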
Combining Terraform and Ansible
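With the terraform binary installed (for example under /usr/local/bin), create a new directory and write a configuration file in it, for example azure.tf: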
# Configure the provider
provider "azurerm" {
version = "=1.20.0"
}
# Create a new resource group
resource "azurerm_resource_group" "rg" {
name = "royTR"
location = "eastasia"
}
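The configuration has two parts: provider and resource. The provider tells Terraform which cloud platform to talk to, here Azure; for AWS you would write provider "aws" instead. The second part is the resources, which declare what to create. The example creates a resource group, and you can keep adding more, such as network interfaces, storage, and virtual machines.
Format: resource "resource_type" "resource_name" { }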
A resource block has two string parameters before opening the block: the resource type (first parameter) and the resource name (second parameter). The combination of the type and name must be unique in the configuration.
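I had already logged in through the Azure CLI, so the provider block above contains no credentials. To configure authentication explicitly, use the following form: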
# Configure the Microsoft Azure Provider
provider "azurerm" {
# More information on the authentication methods supported by
# the AzureRM Provider can be found here:
# http://terraform.io/docs/providers/azurerm/index.html
subscription_id = "..."
client_id = "..."
client_secret = "..."
tenant_id = "..."
}
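How do you obtain these values? You can generate them with the Azure CLI command below. In its output, appId corresponds to client_id, password to client_secret, and tenant to tenant_id: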
az ad sp create-for-rbac --role="Contributor" --scopes="/subscriptions/${SUBSCRIPTION_ID}"
When a project is initialized, Terraform parses the *.tf files in the directory and loads the required provider plugins.
$ terraform init
Initializing provider plugins...
- Checking for available provider plugins on https://releases.hashicorp.com...
- Downloading plugin for provider "azurerm" (1.20.0)...
Terraform has been successfully initialized!
You may now begin working with Terraform. Try running "terraform plan" to see any changes that are required for your infrastructure.
All Terraform commands should now work.
If you ever set or change modules or backend configuration for Terraform, rerun this command to reinitialize your working directory.
If you forget, other commands will detect it and remind you to do so if necessary.
This output shows the execution plan, describing which actions Terraform will take in order to change real infrastructure to match the configuration.
$ terraform apply .
An execution plan has been generated and is shown below. Resource actions are indicated with the following symbols:
+ create
Terraform will perform the following actions:
+ azurerm_resource_group.rg
id: <computed>
location: "eastasia"
name: "royTR"
tags.%: <computed>
Plan: 1 to add, 0 to change, 0 to destroy.
Do you want to perform these actions?
Terraform will perform the actions described above.
Only 'yes' will be accepted to approve.
Enter a value: yes
azurerm_resource_group.rg: Creating...
location: "" => "eastasia"
name: "" => "royTR"
tags.%: "" => "<computed>"
azurerm_resource_group.rg: Creation complete after 1s (ID: /subscriptions/7c91db0e-eb7f-491b-997f-32cf55b85dea/resourceGroups/royTR)
Apply complete! Resources: 1 added, 0 changed, 0 destroyed.
$ terraform state show
id = /subscriptions/7c91db0e-eb7f-491b-997f-32cf55b85dea/resourceGroups/royTR
location = eastasia
name = royTR
tags.% = 0
More command examples:
$ terraform state list
module.roy-azure.azurerm_availability_set.hdp-avset
module.roy-azure.azurerm_network_interface.bastion-nic
...
$ terraform state show module.roy-azure.azurerm_virtual_machine.hdp-slave[1]
...
location = japaneast
name = roy-tf0-hdp-slave-02
...
$ terraform state show module.roy-azure.azurerm_network_interface.hdp[0]
...
ip_configuration.0.load_balancer_backend_address_pools_ids.# = 0
ip_configuration.0.load_balancer_inbound_nat_rules_ids.# = 0
ip_configuration.0.name = hdp-01-ip-conf
....
private_ip_address = 10.0.10.8
...
Modify the file from before and add a tags block.
# Configure the provider
provider "azurerm" {
version = "=1.20.0"
}
# Create a new resource group
resource "azurerm_resource_group" "rg" {
name = "royTR"
location = "eastasia"
tags {
environment = "TF sandbox"
}
}
An execution plan has been generated and is shown below. Resource actions are indicated with the following symbols:
~ update in-place
Terraform will perform the following actions:
~ azurerm_resource_group.rg
tags.%: "0" => "1"
tags.environment: "" => "TF sandbox"
Plan: 0 to add, 1 to change, 0 to destroy.
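The plan output above can be read as a diff between what the state file recorded and what the configuration now asks for. The following is only a minimal Python sketch of that idea, assuming a one-level flattening of maps into the `tags.%` / `tags.environment` keys seen in the output; the function names `flatten` and `diff` are hypothetical, and Terraform's real planning logic (provider schemas, computed attributes, forced replacement) is far more involved.

```python
def flatten(attrs):
    """Flatten one level of nested maps into Terraform 0.11-style flat keys."""
    flat = {}
    for key, value in attrs.items():
        if isinstance(value, dict):
            flat[f"{key}.%"] = str(len(value))  # element count of the map
            for sub_key, sub_value in value.items():
                flat[f"{key}.{sub_key}"] = str(sub_value)
        else:
            flat[key] = str(value)
    return flat


def diff(state, config):
    """Return {attribute: (old, new)} for every attribute whose value changes."""
    old, new = flatten(state), flatten(config)
    changes = {}
    for key in sorted(set(old) | set(new)):
        before, after = old.get(key, ""), new.get(key, "")
        if before != after:
            changes[key] = (before, after)
    return changes


# What the state file recorded after the first apply (no tags yet).
recorded_state = {"name": "royTR", "location": "eastasia", "tags": {}}
# What the edited configuration now declares.
desired_config = {
    "name": "royTR",
    "location": "eastasia",
    "tags": {"environment": "TF sandbox"},
}

for attribute, (before, after) in diff(recorded_state, desired_config).items():
    print(f'{attribute}: "{before}" => "{after}"')
```

Because only attributes differ and the resource itself already exists, the change is an in-place update rather than a create or destroy.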
terraform destroy
$ terraform destroy
azurerm_resource_group.rg: Refreshing state... (ID: /subscriptions/xxxx/resourceGroups/royTR-rg)
An execution plan has been generated and is shown below. Resource actions are indicated with the following symbols:
- destroy
Terraform will perform the following actions:
- azurerm_resource_group.rg
Plan: 0 to add, 0 to change, 1 to destroy.
Do you really want to destroy all resources?
Terraform will destroy all your managed infrastructure, as shown above.
There is no undo.
Only 'yes' will be accepted to confirm.
Enter a value: yes
azurerm_resource_group.rg: Destroying... (ID: /subscriptions/xxxxx/resourceGroups/royTR-rg)
azurerm_resource_group.rg: Still destroying... (ID: /subscriptions/xxxx/resourceGroups/royTR-rg, 10s elapsed)
azurerm_resource_group.rg: Still destroying... (ID: /subscriptions/xxxxx/resourceGroups/royTR-rg, 20s elapsed)
azurerm_resource_group.rg: Still destroying... (ID: /subscriptions/xxxxx/resourceGroups/royTR-rg, 30s elapsed)
azurerm_resource_group.rg: Still destroying... (ID: /subscriptions/xxxxx/resourceGroups/royTR-rg, 40s elapsed)
azurerm_resource_group.rg: Destruction complete after 48s
Destroy complete! Resources: 1 destroyed.
To destroy a single resource:
$ terraform destroy -target=module.roy-azure.azurerm_virtual_machine.hdp[2]
...
An execution plan has been generated and is shown below. Resource actions are indicated with the following symbols:
- destroy
Terraform will perform the following actions:
- module.roy-azure.azurerm_virtual_machine.hdp[2]
Plan: 0 to add, 0 to change, 1 to destroy.
Do you really want to destroy all resources?
....
Destroy complete! Resources: 1 destroyed.
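Creating a VM requires certain resources to exist first, for example a resource group, a virtual network, a subnet, a public IP, a network security group, and a network interface.
Start with a simple example that creates the network: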
# Create virtual network
resource "azurerm_virtual_network" "vnet" {
name = "royTFVnet"
address_space = ["10.0.0.0/16"]
location = "${azurerm_resource_group.rg.location}"
resource_group_name = "${azurerm_resource_group.rg.name}"
}
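The location and resource_group_name arguments use interpolation: the resource group was defined earlier, so its attributes can be referenced directly in the form TYPE.NAME.ATTRIBUTE.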
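References of the form TYPE.NAME.ATTRIBUTE are also what let Terraform work out the order in which resources must be created. The Python sketch below illustrates the idea under simplifying assumptions: the `resources` dictionary is a hypothetical, cut-down view of the configuration, the regular expression only recognizes the simple `${TYPE.NAME.ATTRIBUTE}` form, and the standard-library `graphlib` (Python 3.9+) stands in for Terraform's own graph code.

```python
import re
from graphlib import TopologicalSorter  # Python 3.9+

# Hypothetical, simplified view of a configuration: address -> argument values.
resources = {
    "azurerm_resource_group.rg": {"name": "royTR", "location": "eastasia"},
    "azurerm_virtual_network.vnet": {
        "name": "royTFVnet",
        "location": "${azurerm_resource_group.rg.location}",
        "resource_group_name": "${azurerm_resource_group.rg.name}",
    },
    "azurerm_subnet.subnet": {
        "resource_group_name": "${azurerm_resource_group.rg.name}",
        "virtual_network_name": "${azurerm_virtual_network.vnet.name}",
    },
}

# Matches the TYPE.NAME part of a ${TYPE.NAME.ATTRIBUTE} reference.
REFERENCE = re.compile(r"\$\{(\w+\.\w+)\.\w+\}")


def dependencies(arguments):
    """Collect the resource addresses referenced by one resource's arguments."""
    found = set()
    for value in arguments.values():
        found.update(REFERENCE.findall(value))
    return found


# Map every resource to the set of resources it depends on.
graph = {address: dependencies(args) for address, args in resources.items()}
creation_order = list(TopologicalSorter(graph).static_order())
print(creation_order)
```

Resources with no path between them in the graph have no ordering constraint, which is what makes concurrent creation possible.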
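The basic architecture of an Azure network and virtual machine is shown in the figure below:

Turning the figure above into code gives the complete file needed to create a VM: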
# Configure the provider
provider "azurerm" {
version = "=1.20.0"
}
# Create a new resource group
resource "azurerm_resource_group" "rg" {
name = "royTR"
location = "eastasia"
tags {
environment = "TF sandbox"
}
}
# Create virtual network
resource "azurerm_virtual_network" "vnet" {
name = "royTFVnet"
address_space = ["10.0.0.0/16"]
location = "${azurerm_resource_group.rg.location}"
resource_group_name = "${azurerm_resource_group.rg.name}"
}
# Create subnet
resource "azurerm_subnet" "subnet" {
name = "royTFSubnet"
resource_group_name = "${azurerm_resource_group.rg.name}"
virtual_network_name = "${azurerm_virtual_network.vnet.name}"
address_prefix = "10.0.1.0/24"
#address_prefix = "${cidrsubnet(var.cluster_cidr, 8, 10)}"
}
# Create public IP
resource "azurerm_public_ip" "publicip" {
name = "myTFPublicIP"
location = "${azurerm_resource_group.rg.location}"
resource_group_name = "${azurerm_resource_group.rg.name}"
public_ip_address_allocation = "dynamic"
}
# Create Network Security Group and rule
resource "azurerm_network_security_group" "nsg" {
name = "myTFNSG"
location = "${azurerm_resource_group.rg.location}"
resource_group_name = "${azurerm_resource_group.rg.name}"
security_rule {
name = "SSH"
priority = 1001
direction = "Inbound"
access = "Allow"
protocol = "Tcp"
source_port_range = "*"
destination_port_range = "22"
source_address_prefix = "*"
destination_address_prefix = "*"
}
}
# Create network interface
resource "azurerm_network_interface" "nic" {
name = "myNIC"
location = "${azurerm_resource_group.rg.location}"
resource_group_name = "${azurerm_resource_group.rg.name}"
network_security_group_id = "${azurerm_network_security_group.nsg.id}"
ip_configuration {
name = "myNICConfg"
subnet_id = "${azurerm_subnet.subnet.id}"
private_ip_address_allocation = "dynamic"
public_ip_address_id = "${azurerm_public_ip.publicip.id}"
}
}
# Create a Linux virtual machine
resource "azurerm_virtual_machine" "vm" {
name = "royTFVM"
location = "${azurerm_resource_group.rg.location}"
resource_group_name = "${azurerm_resource_group.rg.name}"
network_interface_ids = ["${azurerm_network_interface.nic.id}"]
vm_size = "Standard_DS1_v2"
storage_os_disk {
name = "myOsDisk"
caching = "ReadWrite"
create_option = "FromImage"
managed_disk_type = "Premium_LRS"
}
storage_image_reference {
publisher = "Canonical"
offer = "UbuntuServer"
sku = "16.04.0-LTS"
version = "latest"
}
os_profile {
computer_name = "royvm"
admin_username = "royzeng"
}
os_profile_linux_config {
disable_password_authentication = true
ssh_keys {
path = "/home/royzeng/.ssh/authorized_keys"
key_data = "ssh-rsa AAAAB3Nz{snip}hwhqT9h"
}
}
}
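The commented-out line in the subnet block, `cidrsubnet(var.cluster_cidr, 8, 10)`, derives a subnet prefix from a larger network instead of hard-coding it. The Python sketch below reproduces that calculation with the standard `ipaddress` module, assuming `var.cluster_cidr` is `10.0.0.0/16` (the value that appears as `cluster_cidr` in the module output later in this article). It is an illustration of the arithmetic, not Terraform's implementation.

```python
import ipaddress


def cidrsubnet(prefix, newbits, netnum):
    """Return subnet number `netnum` after extending `prefix` by `newbits` bits."""
    network = ipaddress.ip_network(prefix)
    new_prefix_length = network.prefixlen + newbits
    if new_prefix_length > network.max_prefixlen:
        raise ValueError("not enough address bits left for newbits")
    if not 0 <= netnum < 2 ** newbits:
        raise ValueError("netnum does not fit in newbits")
    # Each subnet of the new size spans this many addresses.
    subnet_size = 2 ** (network.max_prefixlen - new_prefix_length)
    base = int(network.network_address) + netnum * subnet_size
    return str(ipaddress.ip_network((base, new_prefix_length)))


print(cidrsubnet("10.0.0.0/16", 8, 10))  # 10.0.10.0/24
```

Extending a /16 by 8 bits yields /24 subnets, and subnet number 10 is 10.0.10.0/24.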
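Provisioners run scripts locally or remotely when a resource is created or destroyed.
They are typically used to bootstrap a resource, to clean up before a resource is destroyed, or to perform configuration management.
Several provisioner types cover different needs, such as file transfer (file), local execution (local-exec), and remote execution (remote-exec). A provisioner can be added to any resource: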
# Create a Linux virtual machine
resource "azurerm_virtual_machine" "vm" {
<...snip...>
provisioner "file" {
connection {
type = "ssh"
user = "royzeng"
private_key = "${file("~/.ssh/id_rsa")}"
}
source = "newfile.txt"
destination = "newfile.txt"
}
provisioner "remote-exec" {
connection {
type = "ssh"
user = "royzeng"
private_key = "${file("~/.ssh/id_rsa")}"
}
inline = [
"ls -a",
"cat newfile.txt"
]
}
}
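The approach above suits machines that have a public IP and can be reached directly. A VM that cannot be reached directly is accessed through a jump host (bastion).
The official approach defines bastion_host in the connection block: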
resource "null_resource" "connect_private" {
connection {
bastion_host = "${aws_instance.bastion.public_ip}"
host = "${aws_instance.private.private_ip}"
user = "ubuntu"
private_key = "${file("~/.ssh/id_rsa")}"
}
provisioner "remote-exec" {
inline = ["echo 'CONNECTED to PRIVATE!'"]
}
}
Or:
resource "azurerm_virtual_machine" "vm" {
<...snip...>
provisioner "remote-exec" {
connection {
bastion_host = "${azurerm_public_ip.bastion.ip_address}"
type = "ssh"
user = "${var.admin-username}"
private_key = "${file("~/.ssh/id_rsa")}"
}
inline = [
"sudo parted /dev/disk/azure/scsi1/lun0 mklabel msdos",
"sudo parted /dev/disk/azure/scsi1/lun0 mkpart primary 1 100%",
"sudo partprobe",
"sleep 5; sudo mkfs.xfs /dev/disk/azure/scsi1/lun0-part1",
"sudo mkdir /roytest",
"sudo mount /dev/disk/azure/scsi1/lun0-part1 ${var.mount_path[0]}",
"echo 'UUID='`sudo blkid -s UUID -o value $(readlink -f /dev/disk/azure/scsi1/lun0-part1)` ${var.mount_path[0]} 'xfs defaults 0 0' | sudo tee -a /etc/fstab",
"df -hl | grep /dev/sd"
]
}
}
Another approach is to hop through the jump host with local-exec. Two variants are shown; a provisioner block accepts only one command argument, so the simplified one is commented out here:
provisioner "local-exec" {
## Simplified form
# command = "ssh -o \"ProxyCommand ssh -q -W %h:%p -i mykey jump_server\" -C 'echo hello'"
## Form used in a real environment
command = <<EOF
sleep 30; ansible-playbook -i '${element(azurerm_network_interface.master_bind.*.private_ip_address, count.index)},' ${local.ansible_ssh_args} ${var.ansible_path}/mount_disk.yml --extra-vars '{ "root_user": "centos", "deviceName": "/dev/disk/azure/scsi1/lun0", "mountPath": "${var.mount_path}", "bind_zone_name": "${var.bind_zone_name}" }'
EOF
}
To run the Ansible scripts on their own, without creating or destroying resources, attach the provisioner to a null_resource.
resource "null_resource" "datanode" {
count = "${var.count.datanode}"
triggers {
instance_ids = "${element(aws_instance.datanode.*.id, count.index)}"
}
provisioner "remote-exec" {
inline = [
...
]
connection {
type = "ssh"
user = "centos"
host = "${element(aws_instance.datanode.*.private_ip, count.index)}"
}
}
}
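Several listings in this article pair `element(LIST, count.index)` with `count` to pick the item that belongs to the current copy of a resource. As a rough illustration, the Python sketch below models `element` as indexing with wrap-around, which is how the Terraform documentation describes it. The IP list is an illustrative stand-in for `aws_instance.datanode.*.private_ip`.

```python
def element(items, index):
    """Return items[index], wrapping around when index exceeds the list length."""
    if not items:
        raise ValueError("element() cannot be used with an empty list")
    if index < 0:
        raise ValueError("index must not be negative")
    return items[index % len(items)]


# Illustrative private IPs standing in for aws_instance.datanode.*.private_ip.
private_ips = ["10.0.10.5", "10.0.10.9", "10.0.10.8"]

# Each copy of the null_resource (count.index = 0, 1, 2) gets its own host.
for count_index in range(3):
    print(count_index, element(private_ips, count_index))
```

The wrap-around means an index past the end of the list does not fail but starts again from the first item.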
# file variables.tf
variable "prefix" {
default = "royTF"
}
variable "location" {
}
variable "tags" {
type = "map"
default = {
Environment = "royDemo"
Dept = "Engineering"
}
}
location has no default value in this file, so Terraform prompts for it at run time:
$ terraform plan -out royplan
var.location
Enter a value: eastasia
<...snip...>
This plan was saved to: royplan
To perform exactly these actions, run the following command to apply:
terraform apply "royplan"
Command-line input
$ terraform apply \
>> -var 'prefix=tf' \
>> -var 'location=eastasia'
File input
$ terraform apply \
>> -var-file='secret.tfvars'
By default Terraform reads the file terraform.tfvars, which does not need to be specified explicitly.
Environment variable input
Use TF_VAR_name, for example TF_VAR_location
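With several input methods available, it helps to see how they might combine. The Python sketch below merges the sources in the order I understand Terraform to apply them (defaults, then TF_VAR_ environment variables, then terraform.tfvars, then -var on the command line, with later sources winning). Treat that ordering as an assumption to check against the Terraform documentation for your version; `resolve_variables` is a hypothetical helper, and the interplay of multiple -var-file options is left out.

```python
def resolve_variables(defaults, environ, tfvars, cli_vars):
    """Merge variable sources; later sources override earlier ones."""
    values = dict(defaults)
    # TF_VAR_<name> environment variables map to the variable <name>.
    prefix = "TF_VAR_"
    for key, value in environ.items():
        if key.startswith(prefix):
            values[key[len(prefix):]] = value
    values.update(tfvars)    # e.g. the contents of terraform.tfvars
    values.update(cli_vars)  # -var 'name=value' on the command line
    return values


resolved = resolve_variables(
    defaults={"prefix": "royTF"},
    environ={"TF_VAR_location": "japaneast", "HOME": "/home/royzeng"},
    tfvars={"location": "eastasia"},
    cli_vars={"prefix": "tf"},
)
print(resolved)  # {'prefix': 'tf', 'location': 'eastasia'}
```

A variable that ends up with no value from any source is the case where Terraform prompts interactively, as with location earlier.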
For list variables
# Define a list variable
variable "image-RHEL" {
type = "list"
default = ["RedHat", "RHEL", "7.5", "latest"]
}
# Reference the list variable
storage_image_reference {
publisher = "${var.image-RHEL[0]}"
offer = "${var.image-RHEL[1]}"
sku = "${var.image-RHEL[2]}"
version = "${var.image-RHEL[3]}"
}
A map is a table that can be queried by key.
variable "sku" {
type = "map"
default = {
westus = "16.04-LTS"
eastus = "18.04-LTS"
}
}
Querying it (with lookup)
storage_image_reference {
publisher = "Canonical"
offer = "UbuntuServer"
sku = "${lookup(var.sku, var.location)}"
version = "latest"
}
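To make the behaviour of `lookup(var.sku, var.location)` concrete, here is a small Python sketch of a map lookup with an optional default, using the same `sku` map. It assumes, as the Terraform documentation describes, that a missing key is an error unless a default is supplied; the function itself is only an illustration.

```python
_MISSING = object()  # sentinel: distinguishes "no default" from default=None


def lookup(mapping, key, default=_MISSING):
    """Return mapping[key]; fall back to default, or fail if none was given."""
    if key in mapping:
        return mapping[key]
    if default is _MISSING:
        raise KeyError(f"lookup failed to find {key!r}")
    return default


sku = {"westus": "16.04-LTS", "eastus": "18.04-LTS"}

print(lookup(sku, "eastus"))                 # 18.04-LTS
print(lookup(sku, "eastasia", "16.04-LTS"))  # 16.04-LTS (fallback)
```

Note that the map above only has entries for westus and eastus, so a location such as eastasia needs either its own entry or a default.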
Defining an output
output "ip" {
value = "${azurerm_public_ip.publicip.ip_address}"
}
Testing it
$ terraform apply
...
Apply complete! Resources: 0 added, 0 changed, 0 destroyed.
Outputs:
ip = 52.184.97.1
$ terraform output ip
52.184.97.1
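Note: on the first run the ip output is empty, and terraform output ip also returns nothing; the value only shows up some time later. This is likely because the public IP uses dynamic allocation, in which case Azure assigns the address only after it is attached to a running VM.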
$ terraform output -module=roy-azure
bastion-private-ip = 10.0.1.4
bastion-public-ip = 40.115.243.72
cluster_cidr = 10.0.0.0/16
cluster_location = japaneast
cluster_prefix = roy-tf0
cluster_resource_group = roy-tf0-rg
hdp-master-ip = 10.0.10.4,10.0.10.6,10.0.10.7
hdp-master-name = roy-tf0-hdp-master-01,roy-tf0-hdp-master-02,roy-tf0-hdp-master-03
hdp-slave-ip = 10.0.10.5,10.0.10.9,10.0.10.8
hdp-slave-name = roy-tf0-hdp-slave-01,roy-tf0-hdp-slave-02,roy-tf0-hdp-slave-03
k8s-master-ip = 10.0.20.8,10.0.20.5
k8s-master-name = roy-tf0-k8s-master-01,roy-tf0-k8s-master-02
k8s-slave-ip = 10.0.20.6,10.0.20.7,10.0.20.4
k8s-slave-name = roy-tf0-k8s-slave-01,roy-tf0-k8s-slave-02
virtual_network = roy-tf0-vnet
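A Data Source lets you supply identifying arguments for an existing resource, such as its name, and get back its other attributes.
Taking an Azure network as an example, you provide some information and query the remaining attributes. See the official documentation for which arguments are required and which attributes can be read.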
data "azurerm_virtual_network" "test" {
name = "production"
resource_group_name = "networking"
}
output "virtual_network_id" {
value = "${data.azurerm_virtual_network.test.id}"
}
output "virtual_network_subnet" {
value = "${data.azurerm_virtual_network.test.subnets[0]}"
}
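Ansible connects to target hosts through a host list (inventory), so we need a way for Terraform to generate one. (Having Terraform call Ansible through local-exec is one way to combine the two; generating the inventory is a different line of approach.)
The idea is to go from a template to a file: first render the template into a string with template_file, then write that string to a file with local_file.
Template file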
## file inventory.tpl
[backend]
${bastion_private_ip}
[frontend]
${bastion_pub_ip}
[all:vars]
ansible_ssh_private_key_file = ${key_path}
ansible_ssh_user = dcpuser
Rendering and output
## file inventory.tf
data "template_file" "inventory" {
template = "${file("./test/inventory.tpl")}"
vars {
bastion_private_ip = "${element(azurerm_network_interface.bastion-nic.*.private_ip_address, count.index)}"
bastion_pub_ip = "${element(azurerm_public_ip.bastion.*.ip_address, count.index)}"
key_path = "~/.ssh/id_rsa"
}
}
resource "local_file" "save_inventory" {
content = "${data.template_file.inventory.rendered}"
filename = "./myhost"
}
After running, the file myhost is generated in the current directory:
[backend]
10.0.1.4
[frontend]
13.78.94.242
[all:vars]
ansible_ssh_private_key_file = ~/.ssh/id_rsa
ansible_ssh_user = dcpuser
For multiple hosts, use join to combine them.
File inventory.tf
data "template_file" "k8s" {
template = "${file("./templates/k8s.tpl")}"
vars {
k8s_master_name = "${join("\n", azurerm_virtual_machine.k8s-master.*.name)}"
}
}
resource "local_file" "k8s_file" {
content = "${data.template_file.k8s.rendered}"
filename = "./inventory/k8s-host"
}
File k8s.tpl
[kube-master]
${k8s_master_name}
Final result
[kube-master]
k8s-master-01
k8s-master-02
k8s-master-03
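The render-then-write flow can be tried outside Terraform. Python's `string.Template` happens to use the same `${name}` placeholder syntax as the .tpl files, so the sketch below mirrors what `template_file` plus `join("\n", ...)` produce. The host names are stand-ins for `azurerm_virtual_machine.k8s-master.*.name`, and the template text is inlined rather than read from k8s.tpl.

```python
from string import Template

# Same placeholder syntax as k8s.tpl: ${k8s_master_name}
k8s_template = Template("[kube-master]\n${k8s_master_name}\n")

# Stand-in for azurerm_virtual_machine.k8s-master.*.name
master_names = ["k8s-master-01", "k8s-master-02", "k8s-master-03"]

# join("\n", ...) puts one host per line before the value is substituted.
rendered = k8s_template.substitute(k8s_master_name="\n".join(master_names))
print(rendered, end="")
```

Writing `rendered` to a file is the step that `local_file` performs in the Terraform version.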
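A module is Terraform's mechanism for managing resources as self-contained units: it integrates and abstracts child nodes, child resources, and child architecture templates. Defining a set of reusable resources as a module, and then managing at the module level, simplifies the template structure and reduces the complexity of maintaining it.
A module in Terraform manages a set of Terraform configurations as a group. Modules are used to create reusable components and for basic code organization. Each module can define its own inputs and outputs, which makes it easy to organize code modularly.
With modules you write less code. For example, the following code calls existing modules to create a VM.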
# declare variables and defaults
variable "location" {}
variable "environment" {
default = "dev"
}
variable "vm_size" {
default = {
"dev" = "Standard_B2s"
"prod" = "Standard_D2s_v3"
}
}
# Use the network module to create a resource group, vnet and subnet
module "network" {
source = "Azure/network/azurerm"
version = "2.0.0"
location = "${var.location}"
resource_group_name = "roytest-rg"
address_space = "10.0.0.0/16"
subnet_names = ["mySubnet"]
subnet_prefixes = ["10.0.1.0/24"]
}
# Use the compute module to create the VM
module "compute" {
source = "Azure/compute/azurerm"
version = "1.2.0"
location = "${var.location}"
resource_group_name = "roytest-rg"
vnet_subnet_id = "${element(module.network.vnet_subnets, 0)}"
admin_username = "royzeng"
admin_password = "Password1234!"
remote_port = "22"
vm_os_simple = "UbuntuServer"
vm_size = "${lookup(var.vm_size, var.environment)}"
public_ip_dns = ["roydns"]
}
## file main.tf
module "roy-azure" {
source = "./test"
}
## file test/resource.tf
variable "cluster_prefix" {
type = "string"
}
variable "cluster_location" {
type = "string"
}
resource "azurerm_resource_group" "core" {
name = "${var.cluster_prefix}-rg"
location = "${var.cluster_location}"
}
