Cloudera Data Engineering (CDE) on ECS
This article explains the steps to deploy the CDE service on the ECS platform after successful configuration of the CDP PvC Data Services Management Console.
CDE Deployment
- In Cloudera Manager (CM), navigate to Data Services and click Open CDP Private Cloud Data Services.
- The browser redirects to the CDP Private Cloud Data Services page. Click Data Engineering.
- On the CDE main page, click Enable CDE Service.
- Fill in the required fields and click Enable.
- Next, create a new virtual cluster: click Create DE Cluster.
- Fill in the required fields and click Create.
- The CDE virtual cluster is now ready to run Spark and Airflow jobs.
- Register the user credentials as a Kubernetes secret object in the CDE virtual cluster namespace:
  # ./cdp-cde-utils.sh init-user-in-virtual-cluster -h p2dmnmzb.cde-4c9twhtd.apps.ecs1.cdpkvm.cldr -u ldapuser2 -p ldapuser2.principal -k ldapuser2.keytab
- You can also run CDE jobs with the CDE CLI. The cde tool can be downloaded from the CDE virtual cluster landing page. The file /root/credentials stores the password of the user.
  # cat .cde/config.yaml
  user: ldapuser2
  auth-pass-file: /root/credentials
  vcluster-endpoint: https://p2dmnmzb.cde-4c9twhtd.apps.ecs1.cdpkvm.cldr/dex/api/v1
  tls-insecure: true
  # ./cde job create --type spark --application-file spark_wordcount.py --mount-1-resource resource1 --driver-cores 1 --driver-memory 4g --num-executors 0 --name wordcountjob --log-level DEBUG
  # ./cde job run --name wordcountjob
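The CLI configuration file shown above can also be generated by a small script, which helps when the same setup is repeated for several users. The following is only a minimal sketch, not a Cloudera-provided tool: the function name write_cde_config and its four arguments are hypothetical, and the keys it writes are limited to the ones that appear in this article.

```shell
#!/bin/sh
# Sketch: write a CDE CLI config file with the keys used in this article.
# write_cde_config is a hypothetical helper, not part of the cde tool.
write_cde_config() {
    config_dir=$1     # directory for config.yaml, e.g. "$HOME/.cde"
    cde_user=$2       # workload user, e.g. ldapuser2
    pass_file=$3      # file that stores the user's password
    endpoint=$4       # virtual cluster jobs API URL

    mkdir -p "$config_dir" || return 1
    cat > "$config_dir/config.yaml" <<EOF
user: $cde_user
auth-pass-file: $pass_file
vcluster-endpoint: $endpoint
tls-insecure: true
EOF
    # Restrict the file to its owner, since it points at a password file.
    chmod 600 "$config_dir/config.yaml"
}
```

With the values from this article, the call would be: write_cde_config "$HOME/.cde" ldapuser2 /root/credentials https://p2dmnmzb.cde-4c9twhtd.apps.ecs1.cdpkvm.cldr/dex/api/v1. Note that tls-insecure: true is copied from the example above and disables certificate verification.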
CDE Artifacts inside ECS Platform
# kubectl get ns | head -1 ; kubectl get ns | grep dex
NAME                                     STATUS   AGE
dex-app-nqjfkfb2                         Active   25m
dex-base-ggmgt8m4                        Active   30m
# kubectl -n dex-app-nqjfkfb2 get pods
NAME                                                 READY   STATUS    RESTARTS   AGE
dex-app-nqjfkfb2-airflow-scheduler-79bb7fcc9-2nt5k   1/1     Running   0          31m
dex-app-nqjfkfb2-airflow-web-68bbb47bc8-mqk66        1/1     Running   0          31m
dex-app-nqjfkfb2-airflowapi-6758987794-tplfn         2/2     Running   2          31m
dex-app-nqjfkfb2-api-6cb85f94b9-qmj5b                1/1     Running   0          31m
dex-app-nqjfkfb2-livy-564c8b45c8-4r4ng               1/1     Running   0          31m
dex-app-nqjfkfb2-safari-77fb94577-whrjd              1/1     Running   0          31m
# kubectl -n dex-app-nqjfkfb2 get pvc
NAME                             STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   AGE
airflow-dags                     Bound    pvc-1a57ffb4-92dc-4c03-a958-38702549ceb1   100Gi      RWX            longhorn-nfs   28m
airflow-logs                     Bound    pvc-d424fdf6-2035-4418-8d95-03769926a069   100Gi      RWX            longhorn-nfs   28m
dex-app-nqjfkfb2-livystate-pvc   Bound    pvc-b9088e9a-1fb6-42a8-9e34-5135f0e1ce07   100Gi      RWX            longhorn-nfs   28m
dex-app-nqjfkfb2-safari-pvc      Bound    pvc-e565af81-424a-4e2c-8b32-ade212159492   100Gi      RWX            longhorn-nfs   28m
dex-app-nqjfkfb2-storage-pvc     Bound    pvc-af9ad8cb-b069-45cd-8338-97351ba0bacd   100Gi      RWX            longhorn-nfs   28m
# kubectl -n dex-base-ggmgt8m4  get pods
NAME                                            READY   STATUS    RESTARTS   AGE
cdp-cde-embedded-db-0                           1/1     Running   0          34m
dex-base-configs-manager-686d55b995-992nl       2/2     Running   0          34m
dex-base-dex-downloads-5fb84f65c6-sxqj6         1/1     Running   0          34m
dex-base-ggmgt8m4-controller-6d6c7d598b-79rh9   1/1     Running   0          34m
dex-base-grafana-67d95886cf-kcjpl               1/1     Running   0          34m
dex-base-knox-5d4b8fd79d-f7nxz                  1/1     Running   0          34m
dex-base-management-api-5f76b698f-hqmss         1/1     Running   4          34m
fluentd-forwarder-6747b5b567-bmv5x              1/1     Running   0          34m
# kubectl -n dex-base-ggmgt8m4  get pvc
NAME               STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   AGE
dex-base-db-pvc    Bound    pvc-a12be944-e529-4f23-afad-d0ec58fb9677   100Gi      RWO            longhorn       34m
dex-base-grafana   Bound    pvc-64a82154-e511-4262-960b-92b4be27d631   10Gi       RWO            longhorn       34m
# kubectl get sc
NAME                 PROVISIONER             RECLAIMPOLICY   VOLUMEBINDINGMODE      ALLOWVOLUMEEXPANSION   AGE
local-path           rancher.io/local-path   Delete          WaitForFirstConsumer   false                  3d12h
longhorn (default)   driver.longhorn.io      Delete          Immediate              true                   3d12h
longhorn-nfs         nfs.longhorn.io         Delete          Immediate              false                  3d12h
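The pod listings above can be checked mechanically instead of by eye. The following is a minimal sketch, assuming the default column layout of kubectl get pods (NAME, READY, STATUS, RESTARTS, AGE) as shown in this article; the function name not_ready_pods is hypothetical. It reads a listing on standard input and prints every pod whose status is not Running or whose ready count differs from its container count.

```shell
#!/bin/sh
# Sketch: print pods that are not fully ready from `kubectl get pods` output.
# not_ready_pods is a hypothetical helper; it assumes the default columns
# NAME READY STATUS RESTARTS AGE, as in the listings in this article.
not_ready_pods() {
    awk 'NR > 1 {
        split($2, ready, "/")          # READY looks like "1/1" or "2/2"
        if ($3 != "Running" || ready[1] != ready[2])
            print $1
    }'
}
```

For example, kubectl -n dex-app-nqjfkfb2 get pods | not_ready_pods prints nothing when every pod in the namespace is healthy. Pods in a terminal state such as Completed would also be reported, so the filter may need adjusting for namespaces that run one-off jobs.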