This repository has been archived by the owner on Apr 18, 2024. It is now read-only.

Commit

Merge branch 'resource-manager'
Conflicts:
	terraform/modules/network/outputs.tf
smithzc committed Oct 15, 2019
2 parents 3c4059e + 9447522 commit 8d16552
Showing 30 changed files with 487 additions and 343 deletions.
68 changes: 17 additions & 51 deletions README.md
@@ -1,89 +1,55 @@
# oci-cloudera
This is a Terraform module that deploys [Cloudera Enterprise Data Hub](https://www.cloudera.com/products/enterprise-data-hub.html) on [Oracle Cloud Infrastructure (OCI)](https://cloud.oracle.com/en_US/cloud-infrastructure). It is developed jointly by Oracle and Cloudera.

## Alternate Versions
Future development will include support for EDH v5 clusters. In the meantime, use the [1.0.0 release](https://github.com/oci-quickstart/oci-cloudera/releases/tag/1.0.0) for v5 deployments.
## Deployment Information
The following table shows Recommended and Minimum supported OCI shapes for each cluster role:

| | Worker Nodes | Bastion Instance | Utility and Master Instances |
|-------------|----------------|------------------|------------------------------|
| Recommended | BM.DenseIO2.52 | VM.Standard2.4 | VM.Standard2.16 |
| Minimum | VM.Standard2.8 | VM.Standard2.1 | VM.Standard2.8 |

Host types can be customized in this template. Also included with this template is an easy method to customize block volume quantity and size as pertains to HDFS capacity. See [variables.tf](https://github.com/oracle/oci-quickstart-cloudera/blob/master/terraform/variables.tf#L48-L62) for more information in-line.
## Resource Manager Deployment
This quickstart leverages [OCI Resource Manager](https://docs.cloud.oracle.com/iaas/Content/ResourceManager/Concepts/resourcemanager.htm) to make deployment quite easy. Simply [download the latest .zip](https://github.com/oracle/oci-quickstart-cloudera/zipball/resource-manager) and follow the [Resource Manager instructions](https://docs.cloud.oracle.com/iaas/Content/ResourceManager/Tasks/usingconsole.htm) for how to build a stack. Prior to building the Stack, you may want to modify some parts of the deployment detailed in the sections below and the scripts [README](https://github.com/oracle/oci-quickstart-cloudera/blob/master/scripts/README.md).

## Prerequisites
First off you'll need to do some pre deploy setup. That's all detailed [here](https://github.com/oracle/oci-quickstart-prerequisites).
Alternatively, you can use a schema file to make setting deployment variables even easier. To leverage this feature, the GitHub zipball must be re-packaged so that its contents are top-level prior to creating the ORM Stack. This is a straightforward process:
```
unzip oci-quickstart-cloudera*.zip
cd oci-quickstart-cloudera-<TAB_COMPLETE>
zip -r oci-quickstart-cloudera.zip *
```
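
For environments without the `unzip` and `zip` tools, the same re-packaging can be sketched in Python. This is a minimal sketch, not part of the quickstart: the function name `repackage_top_level` is hypothetical, and it assumes the downloaded zipball wraps everything in a single top-level directory.

```python
import zipfile
from pathlib import PurePosixPath


def repackage_top_level(src_zip, dst_zip):
    """Copy src_zip into dst_zip with the leading wrapper directory removed."""
    with zipfile.ZipFile(src_zip) as src, \
            zipfile.ZipFile(dst_zip, "w", zipfile.ZIP_DEFLATED) as dst:
        for info in src.infolist():
            parts = PurePosixPath(info.filename).parts
            # Skip directory entries and anything not inside the wrapper directory.
            if info.is_dir() or len(parts) < 2:
                continue
            new_info = zipfile.ZipInfo("/".join(parts[1:]), date_time=info.date_time)
            new_info.external_attr = info.external_attr  # keep file mode bits
            new_info.compress_type = zipfile.ZIP_DEFLATED
            dst.writestr(new_info, src.read(info))
```

Calling `repackage_top_level("downloaded.zip", "oci-quickstart-cloudera.zip")` (both file names are placeholders) would produce an archive whose entries start at the repository root. Unlike `zip -r ... *`, this sketch also copies top-level dotfiles.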

### Clone the Module
Now, you'll want a local copy of this repo. You can make that with the commands:

git clone https://github.com/oracle/oci-quickstart-cloudera.git
cd oci-quickstart-cloudera
Use the oci-quickstart-cloudera.zip file created in the last step to create the ORM Stack. The schema file can even be customized for your use, enabling you to build a set of approved variables for deployment.

## Python Deployment using cm_client
The deployment script "deploy_on_oci.py" uses cm_client against Cloudera Manager API v31. As such, it requires some customization before execution. Refer to the header section of the script; modifying the following variables before deployment is highly encouraged:

admin_user_name
admin_password
cluster_name

Also if you modify the compute.tf in any way to change hostname parameters, you will need to update these variables for pattern matching, otherwise cluster deployment will fail:

worker_hosts_prefix = 'cdh-worker'
namenode_host = 'cdh-master-1'
secondary_namenode_host = 'cdh-master-2'
cloudera_manager_host = 'cdh-utility-1'
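
To illustrate why these values must track the hostnames set in compute.tf, here is a minimal sketch of hostname-based role matching. It is an assumption about the general approach, not the logic in deploy_on_oci.py: the function `classify_host` and the example domain are hypothetical, and only the four variable values come from the script header above.

```python
worker_hosts_prefix = 'cdh-worker'
namenode_host = 'cdh-master-1'
secondary_namenode_host = 'cdh-master-2'
cloudera_manager_host = 'cdh-utility-1'


def classify_host(fqdn):
    """Return a role label for a host, or None if no pattern matches."""
    short_name = fqdn.split('.', 1)[0]  # drop the DNS domain part
    exact_roles = {
        namenode_host: 'namenode',
        secondary_namenode_host: 'secondary_namenode',
        cloudera_manager_host: 'cloudera_manager',
    }
    if short_name in exact_roles:
        return exact_roles[short_name]
    if short_name.startswith(worker_hosts_prefix):
        return 'worker'
    return None  # a renamed host falls through and gets no role
```

If the hostnames are changed in compute.tf but the variables are left alone, every host falls through to `None`, which is the kind of mismatch that would leave roles unassigned.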

In addition, further customization of the cluster deployment can be done by modification of the following functions:
These variables are not passed to instance metadata for security purposes; as such, they are only present in the CloudInit deployment script. You can sanitize these after deployment by removing the contents of /var/lib/cloud/instance/scripts/.
In addition, advanced customization of the cluster deployment can be done by modification of the following functions:

setup_mgmt_rcg
update_cluster_rcg_configuration

This does require some knowledge of Python and Cloudera - modify at your own risk. These functions contain Cloudera specific tuning parameters as well as host mapping for roles.
This does require some knowledge of Python and Cloudera configuration - modify at your own risk. These functions contain Cloudera specific tuning parameters as well as host mapping for roles.

## Kerberos Secure Cluster option

This automation supports using a local KDC deployed on the Cloudera Manager instance for secure cluster operation. Please read the scripts [README](https://github.com/oracle/oci-quickstart-cloudera/blob/master/scripts/README.md) for information regarding how to set these parameters prior to deployment.
This automation supports using a local KDC deployed on the Cloudera Manager instance for secure cluster operation. Please read the scripts [README](https://github.com/oracle/oci-quickstart-cloudera/blob/master/scripts/README.md) for information regarding how to set these parameters prior to deployment if desired. This is now enabled by a True/False flag in ORM deployment, and is on by default.

Also - for cluster management, you will need to manually create at a minimum the HDFS Superuser Principal as [detailed here](https://www.cloudera.com/documentation/enterprise/latest/topics/cm_sg_using_cm_sec_config.html#create-hdfs-superuser) after deployment.

Enabling Kerberos is managed using a terraform metadata tag "deployment_type" which is set in [variables.tf](https://github.com/oracle/oci-quickstart-cloudera/blob/master/terraform/variables.tf#L32). Setting this value to "secure" will enable cluster security as part of the setup process. Changing this to "simple" will deploy an unsecured cluster.


## High Availability

High Availability is also offered as part of the deployment process. When secure cluster operation is chosen this is enabled by default. It can be disabled by either changing the deployment_type to "simple", or modifying the [deploy_on_oci.py](https://github.com/oracle/oci-quickstart-cloudera/blob/master/scripts/deploy_on_oci.py#L60) script and changing the value for "hdfs_ha" to False.
High Availability for HDFS services is also offered as part of the deployment process. This can be toggled during the installation process by setting the value to "True".

## Metadata and MySQL

You can customize the default root password for MySQL by editing the source script [cms_mysql.sh](https://github.com/oracle/oci-quickstart-cloudera/blob/master/scripts/cms_mysql.sh#L188). For the various Cloudera databases, random passwords are generated and used. These are stored in a flat file on the Utility host for use at deployment time.

## Object Storage Integration
As of the 2.1.0 release, this template includes a means to deploy clusters configured to use OCI Object Storage through S3 Compatibility. To implement this, an S3 Access and Secret key must first be set up in the OCI Tenancy. This process is detailed [here](https://docs.cloud.oracle.com/iaas/Content/Identity/Tasks/managingcredentials.htm#Working2). Once that is in place, modify the [deploy_on_oci.py](https://github.com/oracle/oci-quickstart-cloudera/blob/master/scripts/deploy_on_oci.py#L101-L108) script, and set the following values:

s3_compat_enable = 'False'
s3a_secret_key = 'None'
s3a_access_key = 'None'
s3a_endpoint = 'None'

The first should be set to 'True', then replace 'None' with each of the required values. This configuration will then be pushed as part of the cluster deployment.
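
Before deploying, it may help to confirm that no placeholder was left behind. The check below is a minimal sketch and is not part of deploy_on_oci.py: the function `missing_s3_settings` is hypothetical, and it assumes the four values are plain strings, as in the defaults shown above.

```python
REQUIRED_WHEN_ENABLED = ('s3a_secret_key', 's3a_access_key', 's3a_endpoint')


def missing_s3_settings(settings):
    """Return the required keys still set to 'None' while S3 compatibility is on."""
    if settings.get('s3_compat_enable') != 'True':
        return []  # feature is off, nothing is required
    return [key for key in REQUIRED_WHEN_ENABLED
            if settings.get(key, 'None') == 'None']
```

An empty list means the settings are either disabled or complete; any returned key names still need real values.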

## Deployment Syntax
Deployment of the module is straightforward using the following Terraform commands:

terraform init
terraform plan
terraform apply

This will create all the required elements in a compartment in the target OCI tenancy. This includes VCN and Security List parameters. Security audit of these in the [network module](https://github.com/oracle/oci-quickstart-cloudera/blob/master/terraform/modules/network/main.tf) is suggested.

## Destroy the Deployment

When you no longer need the deployment, you can run this command to destroy it:

terraform destroy

## Deployment Architecture

Here is a diagram showing what is deployed using this template. Note that resources are automatically distributed among Fault Domains in an Availability Domain to ensure fault tolerance. Additional workers are striped across the 3 Fault Domains in sequence, starting with Fault Domain 1.

![Deployment Architecture Diagram](https://github.com/oracle/oci-quickstart-cloudera/blob/master/images/deployment_architecture.png)
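
The striping of workers across Fault Domains can be sketched as a simple modulo assignment. This is an illustration only, not the Terraform logic in the worker module: the function name is hypothetical and a zero-based worker index is assumed.

```python
def fault_domain_for_worker(worker_index, fault_domains=3):
    """Map a zero-based worker index to an OCI Fault Domain name."""
    return 'FAULT-DOMAIN-{}'.format(worker_index % fault_domains + 1)


# The first five workers cycle through the three Fault Domains in order.
placement = [fault_domain_for_worker(i) for i in range(5)]
# placement == ['FAULT-DOMAIN-1', 'FAULT-DOMAIN-2', 'FAULT-DOMAIN-3',
#               'FAULT-DOMAIN-1', 'FAULT-DOMAIN-2']
```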
31 changes: 11 additions & 20 deletions terraform/compute.tf → compute.tf
@@ -1,23 +1,20 @@
module "bastion" {
source = "modules/bastion"
instances = "1"
instances = "${var.bastion_node_count}"
region = "${var.region}"
compartment_ocid = "${var.compartment_ocid}"
subnet_id = "${module.network.bastion-id}"
availability_domain = "${lookup(data.oci_identity_availability_domains.ADs.availability_domains[var.availability_domain - 1],"name")}"
image_ocid = "${var.InstanceImageOCID[var.region]}"
ssh_keypath = "${var.ssh_keypath}"
ssh_private_key = "${var.ssh_private_key}"
ssh_public_key = "${var.ssh_public_key}"
private_key_path = "${var.private_key_path}"
bastion_instance_shape = "${var.bastion_instance_shape}"
log_volume_size_in_gbs = "${var.log_volume_size_in_gbs}"
cloudera_volume_size_in_gbs = "${var.cloudera_volume_size_in_gbs}"
user_data = "${base64encode(file("../scripts/boot.sh"))}"
user_data = "${base64encode(file("scripts/boot.sh"))}"
cloudera_manager = "cdh-utility-1.public${var.availability_domain}.${module.network.vcn-dn}"
cm_version = "${var.cm_version}"
cdh_version = "${var.cdh_version}"
deployment_type = "${var.deployment_type}"
}

module "utility" {
@@ -28,23 +25,23 @@ module "utility" {
subnet_id = "${module.network.public-id}"
availability_domain = "${lookup(data.oci_identity_availability_domains.ADs.availability_domains[var.availability_domain - 1],"name")}"
image_ocid = "${var.InstanceImageOCID[var.region]}"
ssh_keypath = "${var.ssh_keypath}"
ssh_private_key = "${var.ssh_private_key}"
ssh_public_key = "${var.ssh_public_key}"
private_key_path = "${var.private_key_path}"
utility_instance_shape = "${var.utility_instance_shape}"
log_volume_size_in_gbs = "${var.log_volume_size_in_gbs}"
cloudera_volume_size_in_gbs = "${var.cloudera_volume_size_in_gbs}"
user_data = "${base64encode(file("../scripts/cloudera_manager_boot.sh"))}"
cm_install = "${base64gzip(file("../scripts/cms_mysql.sh"))}"
deploy_on_oci = "${base64gzip(file("../scripts/deploy_on_oci.py"))}"
user_data = "${base64encode(file("scripts/cloudera_manager_boot.sh"))}"
cm_install = "${base64gzip(file("scripts/cms_mysql.sh"))}"
deploy_on_oci = "${base64gzip(file("scripts/deploy_on_oci.py"))}"
cloudera_manager = "cdh-utility-1.public${var.availability_domain}.${module.network.vcn-dn}"
cm_version = "${var.cm_version}"
cdh_version = "${var.cdh_version}"
worker_shape = "${var.worker_instance_shape}"
block_volume_count = "${var.block_volume_count}"
block_volume_count = "${var.block_volumes_per_worker}"
AD = "${var.availability_domain}"
deployment_type = "${var.deployment_type}"
hdfs_ha = "${var.hdfs_ha}"
secure_cluster = "${var.secure_cluster}"
cluster_name = "${var.cluster_name}"
}

module "master" {
@@ -55,18 +52,15 @@ module "master" {
subnet_id = "${module.network.private-id}"
availability_domain = "${lookup(data.oci_identity_availability_domains.ADs.availability_domains[var.availability_domain - 1],"name")}"
image_ocid = "${var.InstanceImageOCID[var.region]}"
ssh_keypath = "${var.ssh_keypath}"
ssh_private_key = "${var.ssh_private_key}"
ssh_public_key = "${var.ssh_public_key}"
private_key_path = "${var.private_key_path}"
master_instance_shape = "${var.master_instance_shape}"
log_volume_size_in_gbs = "${var.log_volume_size_in_gbs}"
cloudera_volume_size_in_gbs = "${var.cloudera_volume_size_in_gbs}"
user_data = "${base64encode(file("../scripts/boot.sh"))}"
user_data = "${base64encode(file("scripts/boot.sh"))}"
cloudera_manager = "cdh-utility-1.public${var.availability_domain}.${module.network.vcn-dn}"
cm_version = "${var.cm_version}"
cdh_version = "${var.cdh_version}"
deployment_type = "${var.deployment_type}"
}

module "worker" {
@@ -77,19 +71,16 @@ module "worker" {
subnet_id = "${module.network.private-id}"
availability_domain = "${lookup(data.oci_identity_availability_domains.ADs.availability_domains[var.availability_domain - 1],"name")}"
image_ocid = "${var.InstanceImageOCID[var.region]}"
ssh_keypath = "${var.ssh_keypath}"
ssh_private_key = "${var.ssh_private_key}"
ssh_public_key = "${var.ssh_public_key}"
private_key_path = "${var.private_key_path}"
worker_instance_shape = "${var.worker_instance_shape}"
log_volume_size_in_gbs = "${var.log_volume_size_in_gbs}"
cloudera_volume_size_in_gbs = "${var.cloudera_volume_size_in_gbs}"
block_volumes_per_worker = "${var.block_volumes_per_worker}"
data_blocksize_in_gbs = "${var.data_blocksize_in_gbs}"
user_data = "${base64encode(file("../scripts/boot.sh"))}"
user_data = "${base64encode(file("scripts/boot.sh"))}"
cloudera_manager = "cdh-utility-1.public${var.availability_domain}.${module.network.vcn-dn}"
cm_version = "${var.cm_version}"
cdh_version = "${var.cdh_version}"
block_volume_count = "${var.block_volumes_per_worker}"
deployment_type = "${var.deployment_type}"
}
File renamed without changes.
Binary file added images/RM_variables.png
File renamed without changes.
3 changes: 0 additions & 3 deletions terraform/modules/bastion/main.tf → modules/bastion/main.tf
@@ -17,7 +17,6 @@ resource "oci_core_instance" "Bastion" {
cloudera_manager = "${var.cloudera_manager}"
cdh_version = "${var.cdh_version}"
cm_version = "${var.cm_version}"
deployment_type = "${var.deployment_type}"
}

timeouts {
@@ -38,7 +37,6 @@ resource "oci_core_volume" "BastionLogVolume" {
resource "oci_core_volume_attachment" "BastionLogAttachment" {
count = "1"
attachment_type = "iscsi"
compartment_id = "${var.compartment_ocid}"
instance_id = "${oci_core_instance.Bastion.id}"
volume_id = "${oci_core_volume.BastionLogVolume.id}"
device = "/dev/oracleoci/oraclevdb"
@@ -56,7 +54,6 @@ resource "oci_core_volume" "BastionClouderaVolume" {
resource "oci_core_volume_attachment" "BastionClouderaAttachment" {
count = "1"
attachment_type = "iscsi"
compartment_id = "${var.compartment_ocid}"
instance_id = "${oci_core_instance.Bastion.id}"
volume_id = "${oci_core_volume.BastionClouderaVolume.id}"
device = "/dev/oracleoci/oraclevdc"
@@ -6,7 +6,6 @@

variable "region" {}
variable "compartment_ocid" {}
variable "private_key_path" {}
variable "ssh_public_key" {}
variable "ssh_private_key" {}
variable "instances" {}
@@ -16,7 +15,6 @@ variable "image_ocid" {}
variable "cm_version" {}
variable "cdh_version" {}
variable "cloudera_manager" {}
variable "deployment_type" {}

# ---------------------------------------------------------------------------------------------------------------------
# Optional variables
@@ -47,12 +45,6 @@ variable "bastion_instance_shape" {
default = "VM.Standard2.8"
}

# Path to SSH Key

variable "ssh_keypath" {
default = "/home/opc/.ssh/id_rsa"
}

# ---------------------------------------------------------------------------------------------------------------------
# Constants
# You probably don't need to change these.
4 changes: 0 additions & 4 deletions terraform/modules/master/main.tf → modules/master/main.tf
@@ -19,7 +19,6 @@ resource "oci_core_instance" "Master" {
cloudera_manager = "${var.cloudera_manager}"
cdh_version = "${var.cdh_version}"
cm_version = "${var.cm_version}"
deployment_type = "${var.deployment_type}"
}

timeouts {
@@ -41,7 +40,6 @@ resource "oci_core_volume" "MasterLogVolume" {
resource "oci_core_volume_attachment" "MasterLogAttachment" {
count = "${var.instances}"
attachment_type = "iscsi"
compartment_id = "${var.compartment_ocid}"
instance_id = "${oci_core_instance.Master.*.id[count.index]}"
volume_id = "${oci_core_volume.MasterLogVolume.*.id[count.index]}"
device = "/dev/oracleoci/oraclevdb"
@@ -59,7 +57,6 @@ resource "oci_core_volume" "MasterClouderaVolume" {
resource "oci_core_volume_attachment" "MasterClouderaAttachment" {
count = "${var.instances}"
attachment_type = "iscsi"
compartment_id = "${var.compartment_ocid}"
instance_id = "${oci_core_instance.Master.*.id[count.index]}"
volume_id = "${oci_core_volume.MasterClouderaVolume.*.id[count.index]}"
device = "/dev/oracleoci/oraclevdc"
@@ -77,7 +74,6 @@ resource "oci_core_volume" "MasterNNVolume" {
resource "oci_core_volume_attachment" "MasterNNAttachment" {
count = "${var.instances}"
attachment_type = "iscsi"
compartment_id = "${var.compartment_ocid}"
instance_id = "${oci_core_instance.Master.*.id[count.index]}"
volume_id = "${oci_core_volume.MasterNNVolume.*.id[count.index]}"
device = "/dev/oracleoci/oraclevdd"
@@ -6,7 +6,6 @@

variable "region" {}
variable "compartment_ocid" {}
variable "private_key_path" {}
variable "ssh_public_key" {}
variable "ssh_private_key" {}
variable "instances" {}
@@ -16,7 +15,6 @@ variable "image_ocid" {}
variable "cm_version" {}
variable "cdh_version" {}
variable "cloudera_manager" {}
variable "deployment_type" {}

# ---------------------------------------------------------------------------------------------------------------------
# Optional variables
@@ -60,12 +58,6 @@ variable "master_instance_shape" {
default = "VM.Standard2.8"
}

# Path to SSH Key

variable "ssh_keypath" {
default = "/home/opc/.ssh/id_rsa"
}

# ---------------------------------------------------------------------------------------------------------------------
# Constants
# You probably don't need to change these.
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
6 changes: 3 additions & 3 deletions terraform/modules/utility/main.tf → modules/utility/main.tf
@@ -21,7 +21,9 @@ resource "oci_core_instance" "Utility" {
worker_shape = "${var.worker_shape}"
block_volume_count = "${var.block_volume_count}"
availability_domain = "${var.AD}"
deployment_type = "${var.deployment_type}"
secure_cluster = "${var.secure_cluster}"
hdfs_ha = "${var.hdfs_ha}"
cluster_name = "${var.cluster_name}"
}

extended_metadata {
@@ -47,7 +49,6 @@ resource "oci_core_volume" "UtilLogVolume" {
resource "oci_core_volume_attachment" "UtilLogAttachment" {
count = "1"
attachment_type = "iscsi"
compartment_id = "${var.compartment_ocid}"
instance_id = "${oci_core_instance.Utility.id}"
volume_id = "${oci_core_volume.UtilLogVolume.id}"
device = "/dev/oracleoci/oraclevdb"
@@ -65,7 +66,6 @@ resource "oci_core_volume" "UtilClouderaVolume" {
resource "oci_core_volume_attachment" "UtilClouderaAttachment" {
count = "1"
attachment_type = "iscsi"
compartment_id = "${var.compartment_ocid}"
instance_id = "${oci_core_instance.Utility.id}"
volume_id = "${oci_core_volume.UtilClouderaVolume.id}"
device = "/dev/oracleoci/oraclevdc"
File renamed without changes.