Version: 9.3

Set Concurrent Kubernetes Lake Garbage Collector Job Limits

On AWS, Qrvey 9.3 deployments include Kueue, a Kubernetes controller that limits how many Lake Garbage Collector jobs can run concurrently.

Overview

New deployments enable job control for Lake Garbage Collector jobs automatically. Kueue runs Garbage Collector jobs in the order they arrive in the queue, with a default limit of five concurrent pods (jobs).

Customers can ask DevOps to place data sync execution jobs in a similar queue.

Before You Begin

Before setting up job controls, verify the following items:

  • You have kubectl access to the Kubernetes cluster.
  • The qrveyapps-jobs namespace is available with the Kueue controller installed.
  • The ClusterQueue resource qrvey-jobs-cluster-queue-lakegc exists. ClusterQueue resources are cluster-scoped, so the kubectl commands in this article do not take a namespace flag.
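The checks above can be scripted before making any change. The following is a minimal sketch, not an official Qrvey tool: the function name preflight_lakegc is hypothetical, and it assumes the Kueue ClusterQueue resource type is registered under the kueue.x-k8s.io API group and that kubectl already points at the correct cluster.

```shell
# Hypothetical helper: confirm the prerequisites before changing job limits.
# Assumes kubectl is already configured for the target cluster.
preflight_lakegc() {
  # 1. Confirm the current user is allowed to patch ClusterQueue resources.
  if ! kubectl auth can-i patch clusterqueues.kueue.x-k8s.io --quiet; then
    echo "error: no permission to patch ClusterQueue resources" >&2
    return 1
  fi
  # 2. Confirm the namespace named in the prerequisites exists.
  if ! kubectl get namespace qrveyapps-jobs >/dev/null 2>&1; then
    echo "error: namespace qrveyapps-jobs not found" >&2
    return 1
  fi
  # 3. Confirm the ClusterQueue exists (this also fails if Kueue is not installed).
  if ! kubectl get clusterqueue qrvey-jobs-cluster-queue-lakegc >/dev/null 2>&1; then
    echo "error: ClusterQueue qrvey-jobs-cluster-queue-lakegc not found" >&2
    return 1
  fi
  echo "ok"
}
```

Calling preflight_lakegc prints ok when all three checks pass; otherwise it reports the first failed check on standard error and returns a non-zero status.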

Change the Concurrent Job Limit

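The ClusterQueue's nominalQuota for pods determines how many pods (jobs) can run at the same time. The patch path used in this procedure addresses that entry by position (resources/2), so it assumes pods is the third entry in the resources list of the first flavor.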

  1. Patch the ClusterQueue resource by running the following command, replacing <value> with the desired number:

    kubectl patch clusterqueue qrvey-jobs-cluster-queue-lakegc --type json -p='[
    {
    "op": "replace",
    "path": "/spec/resourceGroups/0/flavors/0/resources/2/nominalQuota",
    "value": "<value>"
    }
    ]'

    For example, to set the limit to 10 concurrent jobs:

    kubectl patch clusterqueue qrvey-jobs-cluster-queue-lakegc --type json -p='[
    {
    "op": "replace",
    "path": "/spec/resourceGroups/0/flavors/0/resources/2/nominalQuota",
    "value": "10"
    }
    ]'

  2. Verify the change by inspecting the ClusterQueue resource and confirming that the nominalQuota for pods shows the new value:

    kubectl get clusterqueue qrvey-jobs-cluster-queue-lakegc -o yaml
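The two steps above can be wrapped in small helper functions so that the limit is validated before it reaches the cluster. This is a sketch under stated assumptions: the function names build_quota_patch, set_lakegc_limit, and get_lakegc_limit are hypothetical, the pods entry is assumed to sit at index 2 unless a different index is passed, and the resource entry is assumed to be named pods.

```shell
# Hypothetical helper: print the JSON patch for a given limit.
# $1 = desired concurrent job limit, $2 = index of the pods entry (default 2).
build_quota_patch() {
  idx="${2:-2}"
  # Reject empty, non-numeric, zero, or zero-padded limits.
  case "$1" in
    ''|*[!0-9]*|0*)
      echo "error: limit must be a positive integer, got '$1'" >&2
      return 1
      ;;
  esac
  # Reject a non-numeric index.
  case "$idx" in
    ''|*[!0-9]*)
      echo "error: index must be a non-negative integer, got '$idx'" >&2
      return 1
      ;;
  esac
  printf '[{"op":"replace","path":"/spec/resourceGroups/0/flavors/0/resources/%s/nominalQuota","value":"%s"}]\n' "$idx" "$1"
}

# Hypothetical helper: validate the limit, then apply the patch.
set_lakegc_limit() {
  lakegc_patch="$(build_quota_patch "$@")" || return 1
  kubectl patch clusterqueue qrvey-jobs-cluster-queue-lakegc --type json -p "$lakegc_patch"
}

# Hypothetical helper: read back only the pods quota instead of the full YAML.
# Assumes the resource entry is named "pods".
get_lakegc_limit() {
  kubectl get clusterqueue qrvey-jobs-cluster-queue-lakegc \
    -o jsonpath='{.spec.resourceGroups[0].flavors[0].resources[?(@.name=="pods")].nominalQuota}'
}
```

For example, set_lakegc_limit 10 followed by get_lakegc_limit applies the limit and reads back the stored quota. Selecting the entry by name in get_lakegc_limit avoids relying on its position, which is useful for confirming that index 2 really is the pods entry in a given cluster.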

Configure LGC Job Settings

Lake Garbage Collector (LGC) jobs are triggered from pods managed by the qrvey-job-manager deployment. To modify job settings, update the environment variables in that deployment.

Variable | Default | Description
LAKEGC_QUEUE_NAME | qrvey-lakegc-queue | Name of the local queue where LGC jobs are sent.
LAKEGC_TTL_SECONDS_AFTER_FINISHED | 5 | Number of seconds to keep a completed LGC Kubernetes Job after it finishes before Kubernetes automatically deletes it. Use this variable to control how long finished jobs remain available for inspection, logs, and debugging before cleanup.
LAKEGC_JOB_CPU_REQUEST | 100m | Kubernetes CPU request assigned to the LGC job container. Defines the minimum CPU the scheduler reserves for the job.
LAKEGC_JOB_MEM_REQUEST | 256Mi | Kubernetes memory request assigned to the LGC job container. Defines the minimum memory the scheduler reserves for the job.
LAKEGC_JOB_CPU_LIMIT | 500m | Kubernetes CPU limit assigned to the LGC job container. Defines the maximum CPU the container is allowed to use at runtime.
LAKEGC_JOB_MEM_LIMIT | 512Mi | Kubernetes memory limit assigned to the LGC job container. Defines the maximum memory the container is allowed to use at runtime.
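One way to update these variables is with kubectl set env. The sketch below is illustrative only: the function names is_lakegc_var and set_lakegc_env are hypothetical, and because this article does not state which namespace hosts the qrvey-job-manager deployment, the namespace must be supplied as the first argument.

```shell
# Hypothetical helper: succeed only for the variable names listed in the table.
is_lakegc_var() {
  case "$1" in
    LAKEGC_QUEUE_NAME|LAKEGC_TTL_SECONDS_AFTER_FINISHED|\
    LAKEGC_JOB_CPU_REQUEST|LAKEGC_JOB_MEM_REQUEST|\
    LAKEGC_JOB_CPU_LIMIT|LAKEGC_JOB_MEM_LIMIT)
      return 0 ;;
    *)
      return 1 ;;
  esac
}

# Hypothetical helper: validate NAME=VALUE pairs, then update the deployment.
# $1 = namespace of the qrvey-job-manager deployment (not stated in this article).
set_lakegc_env() {
  if [ "$#" -lt 2 ]; then
    echo "usage: set_lakegc_env <namespace> NAME=VALUE [NAME=VALUE ...]" >&2
    return 1
  fi
  lakegc_ns="$1"
  shift
  for pair in "$@"; do
    # Each argument must have the form NAME=VALUE.
    case "$pair" in
      *=*) ;;
      *) echo "error: expected NAME=VALUE, got '$pair'" >&2; return 1 ;;
    esac
    # Catch typos in variable names before touching the deployment.
    if ! is_lakegc_var "${pair%%=*}"; then
      echo "error: unknown variable '${pair%%=*}'" >&2
      return 1
    fi
  done
  kubectl set env deployment/qrvey-job-manager -n "$lakegc_ns" "$@"
}
```

For example, set_lakegc_env <namespace> LAKEGC_TTL_SECONDS_AFTER_FINISHED=300 would keep finished jobs for five minutes; the value 300 is only an illustration. Changing environment variables on a deployment modifies its pod template, so Kubernetes normally rolls out new job manager pods to apply the change.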

Additional Resources