AI in 5G Wireless Networks

I have been working in the telecom sector since the early days of fixed access networks, MPLS-based packet cores, GSM, 3G, and 4G/LTE, and now through the latest wave: 5G enhanced with AI. A common question I hear from peers is: “What’s really new in 5G for voice calls, SMS, or even mobile apps?”

The reality is that 5G’s true potential goes far beyond consumer voice and data services. While end-users will experience incremental improvements, 5G is predominantly designed to unlock industrial-scale innovation—enabling use cases such as smart factories, autonomous systems, and large-scale IoT ecosystems.


AI and the Future of Telecom Networks

Artificial intelligence (AI) is emerging as a critical enabler in next-generation RAN deployments and new business models, including neutral host operators, massive IoT connectivity, and mission-critical IoT applications.

The telecom landscape is undergoing a seismic shift, driven by the convergence of 5G and AI. Networks are becoming increasingly complex, and intelligent automation is no longer optional—it is essential. Imagine a network that adapts in real time to user demand, optimizes resource usage instantly, and guarantees seamless connectivity. This is not a futuristic vision; it is already becoming reality.


Dynamic Networks with AI and Machine Learning

With AI at the helm, telecom operators are integrating advanced algorithms into both core and access networks. By continuously analyzing usage patterns, these systems dynamically tune network parameters to match demand.

  • Traffic Adaptation: When traffic surges, AI-driven orchestration automatically increases capacity, ensuring performance is maintained without manual intervention.
  • Resource Optimization: Idle bandwidth or underutilized resources are dynamically reallocated, improving efficiency across the network.
  • Predictive Assurance: AI models forecast potential service degradations before they escalate, enabling proactive intervention and guaranteeing service continuity.

The result is a network that is self-optimizing, resilient, and customer-centric, ultimately boosting user trust and satisfaction.
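
To make the “predictive” part concrete with a toy example (entirely illustrative, not an operator-grade algorithm): an exponentially weighted moving average over a utilization metric can flag a worrying trend while each raw sample is still below the hard alarm threshold, buying the orchestrator time to act.

# Toy predictive check: EWMA over utilization samples, one per line
printf '%s\n' 55 62 68 74 78 81 | awk '
BEGIN { alpha = 0.5 }
NR == 1 { ewma = $1; next }
{
  ewma = alpha * $1 + (1 - alpha) * ewma
  if (ewma > 70 && $1 < 85)
    printf "early warning: trend %.1f%% (latest sample %s%%)\n", ewma, $1
}'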


AI Use Case: Synthetic Alarms and Automated Optimization

In this blog series, I will walk through a practical use case where synthetic alarms are generated to simulate network anomalies. We will then build an AI/ML model that learns from these patterns and triggers corrective actions automatically—optimizing the network without human intervention.


Deploying Kubeflow for Telecom AI

To implement this, I am deploying Kubeflow, a powerful machine learning platform, on top of a Kubernetes cluster. Kubeflow provides the framework for building, training, and deploying ML models that can interact with live network telemetry and automate optimization loops.

For an on-premises Kubeflow deployment, the following infrastructure components are essential:

  • Kubernetes cluster (pre-installed)
  • NFS storage for persistence (see the StorageClass check after this list)
  • Load balancers and routers for service access
  • DNS servers for service discovery
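
In particular, Kubeflow’s stateful components (MySQL, MinIO, Katib) expect a default StorageClass so their PersistentVolumeClaims bind automatically. Here is a quick check for the NFS-backed cluster; the class name nfs-client is an assumption based on the nfs-subdir-external-provisioner defaults, so substitute your own:

# Confirm a default StorageClass exists for dynamic PVC provisioning
kubectl get storageclass
# If the NFS class is not marked default, annotate it (class name assumed)
kubectl patch storageclass nfs-client \
  -p '{"metadata": {"annotations": {"storageclass.kubernetes.io/is-default-class": "true"}}}'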

I have a Kubernetes cluster with a single master and two worker nodes running on HP DL380 servers:

[root@nfsweb manifests]# kubectl get nodes
NAME                           STATUS   ROLES            AGE   VERSION
kubemaster.ranjeetbadhe.com    Ready    control-plane    10d   v1.33.4
kubeworker1.ranjeetbadhe.com   Ready    mazdoor,worker   10d   v1.33.4
kubeworker2.ranjeetbadhe.com   Ready    mazdoor,worker   10d   v1.33.4

The first step is to install the Git command-line tool on my system. Once Git is available, I clone the kubeflow/manifests repository, which downloads all the deployment manifests and project history to my local environment. Kubeflow itself is a Kubernetes-native MLOps platform that streamlines the entire machine learning lifecycle (training, tuning, deployment, and monitoring) on scalable, cloud-native infrastructure. Built on Kubernetes CRDs and operators, it integrates components such as Pipelines for workflow automation, KServe for model serving, and Katib for hyperparameter optimization. Its modular, portable design enables reproducible ML at scale across on-prem, hybrid, and multi-cloud environments, making it a strategic choice for enterprise-grade AI deployments.

[root@nfsweb ~]# dnf install git -y
Updating Subscription Management repositories.
Last metadata expiration check: 3:16:38 ago on Fri 19 Sep 2025 03:23:01 AM UTC.
Dependencies resolved.
==================================================================================================
 Package                                   Architecture                    Version                
==================================================================================================
Installing:
 git                                       x86_64                          2.47.3-1.el9_6         
Installing dependencies:
 git-core                                  x86_64                          2.47.3-1.el9_6         
 git-core-doc                              noarch                          2.47.3-1.el9_6         
 perl-Error                                noarch                          1:0.17029-7.el9        
 perl-Git                                  noarch                          2.47.3-1.el9_6         
 perl-TermReadKey                          x86_64                          2.38-11.el9            
 perl-lib                                  x86_64                          0.65-481.1.el9_6       

Transaction Summary
==================================================================================================
Install  7 Packages

[root@nfsweb ~]# git clone https://github.com/kubeflow/manifests.git
Cloning into 'manifests'...
remote: Enumerating objects: 34742, done.
remote: Counting objects: 100% (648/648), done.
remote: Compressing objects: 100% (456/456), done.
remote: Total 34742 (delta 422), reused 192 (delta 192), pack-reused 34094 (from 3)
Receiving objects: 100% (34742/34742), 47.33 MiB | 6.98 MiB/s, done.
Resolving deltas: 100% (20657/20657), done.

[root@nfsweb ~]# cd manifests/
[root@nfsweb manifests]# while ! kustomize build example | kubectl apply --server-side --force-conflicts -f -; do   echo "Retrying to apply resources";   sleep 20; done
# Warning: 'vars' is deprecated. Please use 'replacements' instead. [EXPERIMENTAL] Run 'kustomize edit fix' to update your Kustomization automatically.
# Warning: 'patchesStrategicMerge' is deprecated. Please use 'patches' instead. Run 'kustomize edit fix' to update your Kustomization automatically.

validatingwebhookconfiguration.admissionregistration.k8s.io/pvcviewer-validating-webhook-configuration serverside-applied
validatingwebhookconfiguration.admissionregistration.k8s.io/servingruntime.serving.kserve.io serverside-applied
validatingwebhookconfiguration.admissionregistration.k8s.io/spark-operator-webhook serverside-applied
validatingwebhookconfiguration.admissionregistration.k8s.io/trainedmodel.serving.kserve.io serverside-applied
validatingwebhookconfiguration.admissionregistration.k8s.io/validation.webhook.serving.knative.dev serverside-applied
validatingwebhookconfiguration.admissionregistration.k8s.io/validator.trainer.kubeflow.org serverside-applied
[root@nfsweb manifests]#
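
The apply loop can take several minutes to converge while CRDs and webhooks come up. As a sanity check of my own (not part of the official instructions), a small wait loop confirms that every pod in the core namespaces reaches Ready:

# Wait for all Kubeflow-related pods to become Ready (adjust namespaces and timeout to taste)
for ns in cert-manager istio-system knative-serving kubeflow kubeflow-system oauth2-proxy auth; do
  echo "Waiting for pods in $ns ..."
  kubectl wait --for=condition=Ready pods --all -n "$ns" --timeout=600s
done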

[root@nfsweb manifests]# kubectl get ns
NAME                        STATUS   AGE
argocd                      Active   4h21m
auth                        Active   34m
calico-apiserver            Active   10d
calico-system               Active   10d
cert-manager                Active   34m
default                     Active   10d
istio-system                Active   34m
knative-serving             Active   34m
kube-node-lease             Active   10d
kube-public                 Active   10d
kube-system                 Active   10d
kubeflow                    Active   34m
kubeflow-system             Active   34m
kubeflow-user-example-com   Active   18m
nfs-provisioner             Active   9d
oauth2-proxy                Active   34m
tigera-operator             Active   10d

[root@nfsweb manifests]# kubectl get ns --sort-by=.metadata.creationTimestamp \
| awk 'NR==1{print; next} {a[NR]=$0} END{for(i=NR;i>1;i--) print a[i]}'
NAME                        STATUS   AGE
kubeflow-user-example-com   Active   22m
auth                        Active   38m
oauth2-proxy                Active   38m
istio-system                Active   38m
kubeflow-system             Active   38m
kubeflow                    Active   38m
knative-serving             Active   38m
cert-manager                Active   38m
argocd                      Active   4h25m
nfs-provisioner             Active   9d
calico-system               Active   10d
calico-apiserver            Active   10d
tigera-operator             Active   10d
kube-node-lease             Active   10d
default                     Active   10d
kube-public                 Active   10d
kube-system                 Active   10d

Here is a pod security shell script that finds the namespaces with Istio injection enabled. On each of those namespaces it applies two Pod Security Admission labels, allowing their pods to run under the privileged profile, and then restarts all deployments so the new policy takes effect.

[root@nfsweb manifests]# vi security.sh
[root@nfsweb manifests]# chmod +x security.sh
[root@nfsweb manifests]# cat security.sh
#!/usr/bin/env bash
set -euo pipefail

# Find all namespaces with Istio injection enabled
namespaces=$(kubectl get ns \
  -l istio-injection=enabled \
  -o jsonpath='{.items[*].metadata.name}')

if [[ -z "$namespaces" ]]; then
  echo "No namespaces found with istio-injection=enabled"
  exit 1
fi

for ns in $namespaces; do
  echo "→ Updating PodSecurity on namespace \"$ns\""
  kubectl label namespace "$ns" \
    pod-security.kubernetes.io/enforce=privileged \
    pod-security.kubernetes.io/enforce-version=latest \
    --overwrite

  echo "→ Restarting all deployments in \"$ns\""
  kubectl rollout restart deployment -n "$ns"
done

echo "✅ Done."

Running the script:

[root@nfsweb manifests]# sh security.sh
→ Updating PodSecurity on namespace "knative-serving"
namespace/knative-serving labeled
→ Restarting all deployments in "knative-serving"
deployment.apps/activator restarted
deployment.apps/autoscaler restarted
deployment.apps/controller restarted
deployment.apps/net-istio-controller restarted
deployment.apps/net-istio-webhook restarted
deployment.apps/webhook restarted
→ Updating PodSecurity on namespace "kubeflow"
namespace/kubeflow labeled
→ Restarting all deployments in "kubeflow"
deployment.apps/admission-webhook-deployment restarted
deployment.apps/cache-server restarted
deployment.apps/centraldashboard restarted
deployment.apps/jupyter-web-app-deployment restarted
deployment.apps/katib-controller restarted
deployment.apps/katib-db-manager restarted
deployment.apps/katib-mysql restarted
deployment.apps/katib-ui restarted
deployment.apps/kserve-controller-manager restarted
deployment.apps/kserve-localmodel-controller-manager restarted
deployment.apps/kserve-models-web-app restarted
deployment.apps/kubeflow-pipelines-profile-controller restarted
deployment.apps/metadata-envoy-deployment restarted
deployment.apps/metadata-grpc-deployment restarted
deployment.apps/metadata-writer restarted
deployment.apps/minio restarted
deployment.apps/ml-pipeline restarted
deployment.apps/ml-pipeline-persistenceagent restarted
deployment.apps/ml-pipeline-scheduledworkflow restarted
deployment.apps/ml-pipeline-ui restarted
deployment.apps/ml-pipeline-viewer-crd restarted
deployment.apps/ml-pipeline-visualizationserver restarted
deployment.apps/mysql restarted
deployment.apps/notebook-controller-deployment restarted
deployment.apps/profiles-deployment restarted
deployment.apps/pvcviewer-controller-manager restarted
deployment.apps/seaweedfs restarted
deployment.apps/spark-operator-controller restarted
deployment.apps/spark-operator-webhook restarted
deployment.apps/tensorboard-controller-deployment restarted
deployment.apps/tensorboards-web-app-deployment restarted
deployment.apps/volumes-web-app-deployment restarted
deployment.apps/workflow-controller restarted
→ Updating PodSecurity on namespace "kubeflow-system"
namespace/kubeflow-system labeled
→ Restarting all deployments in "kubeflow-system"
deployment.apps/jobset-controller-manager restarted
deployment.apps/kubeflow-trainer-controller-manager restarted
→ Updating PodSecurity on namespace "kubeflow-user-example-com"
namespace/kubeflow-user-example-com labeled
→ Restarting all deployments in "kubeflow-user-example-com"
No resources found in kubeflow-user-example-com namespace.
✅ Done.
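
To confirm the new labels actually landed, the namespaces can be listed together with their PodSecurity label values (a quick check of my own, not part of the script):

kubectl get ns -l istio-injection=enabled \
  -L pod-security.kubernetes.io/enforce,pod-security.kubernetes.io/enforce-version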

The istio-ingressgateway service has been created; next, we configure the MetalLB load balancer for external access.

[root@nfsweb manifests]# kubectl get svc -n istio-system
NAME                    TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)                                 AGE
cluster-local-gateway   ClusterIP   10.111.195.133   <none>        15020/TCP,80/TCP                        131m
istio-ingressgateway    ClusterIP   10.96.228.44     <none>        15021/TCP,80/TCP,443/TCP                131m
istiod                  ClusterIP   10.104.238.195   <none>        15010/TCP,15012/TCP,443/TCP,15014/TCP   131m
knative-local-gateway   ClusterIP   10.105.185.81    <none>        80/TCP,443/TCP                          131m

To make the applications reachable from outside the cluster, let's deploy MetalLB:

[root@nfsweb manifests]#  kubectl apply -f https://raw.githubusercontent.com/metallb/metallb/v0.14.8/config/manifests/metallb-native.yaml
namespace/metallb-system created
customresourcedefinition.apiextensions.k8s.io/bfdprofiles.metallb.io created
customresourcedefinition.apiextensions.k8s.io/bgpadvertisements.metallb.io created
customresourcedefinition.apiextensions.k8s.io/bgppeers.metallb.io created
customresourcedefinition.apiextensions.k8s.io/communities.metallb.io created
customresourcedefinition.apiextensions.k8s.io/ipaddresspools.metallb.io created
customresourcedefinition.apiextensions.k8s.io/l2advertisements.metallb.io created
customresourcedefinition.apiextensions.k8s.io/servicel2statuses.metallb.io created
serviceaccount/controller created
serviceaccount/speaker created
role.rbac.authorization.k8s.io/controller created
role.rbac.authorization.k8s.io/pod-lister created
clusterrole.rbac.authorization.k8s.io/metallb-system:controller created
clusterrole.rbac.authorization.k8s.io/metallb-system:speaker created
rolebinding.rbac.authorization.k8s.io/controller created
rolebinding.rbac.authorization.k8s.io/pod-lister created
clusterrolebinding.rbac.authorization.k8s.io/metallb-system:controller created
clusterrolebinding.rbac.authorization.k8s.io/metallb-system:speaker created
configmap/metallb-excludel2 created
secret/metallb-webhook-cert created
service/metallb-webhook-service created
deployment.apps/controller created
daemonset.apps/speaker created
validatingwebhookconfiguration.admissionregistration.k8s.io/metallb-webhook-configuration created
[root@nfsweb manifests]# kubectl apply -f metal-lb-manifest.yaml
ipaddresspool.metallb.io/default-pool created
l2advertisement.metallb.io/default-l2 created


[root@nfsweb manifests]# cat metal-lb-manifest.yaml
apiVersion: metallb.io/v1beta1
kind: IPAddressPool
metadata:
  name: default-pool
  namespace: metallb-system
spec:
  addresses:
  - 192.168.0.220-192.168.0.240
---
apiVersion: metallb.io/v1beta1
kind: L2Advertisement
metadata:
  name: default-l2
  namespace: metallb-system
spec:
  ipAddressPools:
  - default-pool
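
The address range must be a block of unused IPs on the same Layer 2 subnet as the cluster nodes, because MetalLB's L2 mode answers ARP for these addresses directly. A quick check that the pool and advertisement were accepted (my own verification step):

kubectl get ipaddresspools.metallb.io,l2advertisements.metallb.io -n metallb-system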

Next, the istio-ingressgateway service must be switched from ClusterIP to LoadBalancer so that MetalLB can assign it an external address. Here is the service as originally created:

apiVersion: v1
kind: Service
metadata:
  creationTimestamp: "2025-09-19T06:57:05Z"
  labels:
    app: istio-ingressgateway
    app.kubernetes.io/instance: istio
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/name: istio-ingressgateway
    app.kubernetes.io/part-of: istio
    app.kubernetes.io/version: 1.27.0
    helm.sh/chart: istio-ingress-1.27.0
    install.operator.istio.io/owning-resource: unknown
    istio: ingressgateway
    istio.io/rev: default
    operator.istio.io/component: IngressGateways
    release: istio
  name: istio-ingressgateway
  namespace: istio-system
  resourceVersion: "293386"
  uid: be5a6a8c-ac11-4090-bd5a-4b26bc02d71d
spec:
  clusterIP: 10.96.228.44
  clusterIPs:
  - 10.96.228.44
  internalTrafficPolicy: Cluster
  ipFamilies:
  - IPv4
  ipFamilyPolicy: SingleStack
  ports:
  - name: status-port
    port: 15021
    protocol: TCP
    targetPort: 15021
  - name: http2
    port: 80
    protocol: TCP
    targetPort: 8080
  - name: https
    port: 443
    protocol: TCP
    targetPort: 8443
  selector:
    app: istio-ingressgateway
    istio: ingressgateway
  sessionAffinity: None
  type: ClusterIP        # changed to LoadBalancer (see below)
status:
  loadBalancer: {}

The service type is changed from ClusterIP to LoadBalancer; MetalLB then assigns an address from the default pool, as the updated service definition shows.
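
One way to make this change without hand-editing the manifest (a sketch; kubectl edit svc istio-ingressgateway -n istio-system works equally well):

kubectl patch svc istio-ingressgateway -n istio-system \
  -p '{"spec": {"type": "LoadBalancer"}}'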

apiVersion: v1
kind: Service
metadata:
  annotations:
    metallb.universe.tf/ip-allocated-from-pool: default-pool
  creationTimestamp: "2025-09-19T06:57:05Z"
  labels:
    app: istio-ingressgateway
    app.kubernetes.io/instance: istio
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/name: istio-ingressgateway
    app.kubernetes.io/part-of: istio
    app.kubernetes.io/version: 1.27.0
    helm.sh/chart: istio-ingress-1.27.0
    install.operator.istio.io/owning-resource: unknown
    istio: ingressgateway
    istio.io/rev: default
    operator.istio.io/component: IngressGateways
    release: istio
  name: istio-ingressgateway
  namespace: istio-system
  resourceVersion: "385322"
  uid: be5a6a8c-ac11-4090-bd5a-4b26bc02d71d
spec:
  allocateLoadBalancerNodePorts: true
  clusterIP: 10.96.228.44
  clusterIPs:
  - 10.96.228.44
  externalTrafficPolicy: Cluster
  internalTrafficPolicy: Cluster
  ipFamilies:
  - IPv4
  ipFamilyPolicy: SingleStack
  ports:
  - name: status-port
    nodePort: 30556
    port: 15021
    protocol: TCP
    targetPort: 15021
  - name: http2
    nodePort: 32547
    port: 80
    protocol: TCP
    targetPort: 8080
  - name: https
    nodePort: 30103
    port: 443
    protocol: TCP
    targetPort: 8443
  selector:
    app: istio-ingressgateway
    istio: ingressgateway
  sessionAffinity: None
  type: LoadBalancer
status:
  loadBalancer:
    ingress:
    - ip: 192.168.0.220
      ipMode: VIP

Changing the Kubeflow Gateway Settings

The kubeflow-gateway in the kubeflow namespace is then extended with an HTTPS listener in addition to the default HTTP server:

apiVersion: networking.istio.io/v1
kind: Gateway
metadata:
  creationTimestamp: "2025-09-19T06:58:09Z"
  generation: 1
  name: kubeflow-gateway
  namespace: kubeflow
  resourceVersion: "295072"
  uid: 099a06a8-d72c-467b-9769-cee2f03259b1
spec:
  selector:
    istio: ingressgateway
  servers:
  - hosts:
    - '*'
    port:
      name: http
      number: 80
      protocol: HTTP

  # HTTPS listener
  - hosts:
    - '*'
    port:
      name: https
      number: 443
      protocol: HTTPS
    tls:
      mode: SIMPLE
      credentialName: kubeflow-ingressgateway-certs

Verifying the Load Balancer

[root@nfsweb manifests]#  kubectl get svc -n istio-system
NAME                    TYPE           CLUSTER-IP       EXTERNAL-IP     PORT(S)                                      AGE
cluster-local-gateway   ClusterIP      10.111.195.133   <none>          15020/TCP,80/TCP                             3h31m
istio-ingressgateway    LoadBalancer   10.96.228.44     192.168.0.220   15021:30556/TCP,80:32547/TCP,443:30103/TCP   3h31m
istiod                  ClusterIP      10.104.238.195   <none>          15010/TCP,15012/TCP,443/TCP,15014/TCP        3h31m
knative-local-gateway   ClusterIP      10.105.185.81    <none>          80/TCP,443/TCP                               3h31m

MetalLB has assigned the external IP 192.168.0.220, which we will use for access.

Accessing the GUI
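
Before opening a browser, a quick reachability test against the MetalLB address confirms the gateway answers; with the self-signed certificate, curl needs -k, and an HTTP redirect toward the Dex login flow is the expected response:

curl -k -s -o /dev/null -w "%{http_code}\n" https://192.168.0.220/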

 

With the deployment complete, our next episode will dive into the exciting part: building, training, and deploying machine learning models that can actively interact with live network telemetry. These models will process SNMP alarms and data flowing in through southbound protocols to enable automated, closed-loop optimization. To illustrate the concept clearly, we'll start with a synthetic alarm as our use case before scaling toward real-world scenarios.
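
As a preview of that use case, a synthetic alarm can be as simple as a JSON record shaped like an SNMP trap. The sketch below is entirely illustrative (the field names and thresholds are my own, not from any standard MIB); it emits one random alarm per second, the kind of stream the ML model will later consume:

#!/usr/bin/env bash
# Emit one synthetic, SNMP-trap-like alarm per second (illustrative fields only)
sites=(site-001 site-002 site-003)
severities=(minor major critical)
while true; do
  printf '{"ts":"%s","site":"%s","alarm":"HIGH_PRB_UTILIZATION","severity":"%s","value":%d}\n' \
    "$(date -u +%FT%TZ)" \
    "${sites[RANDOM % ${#sites[@]}]}" \
    "${severities[RANDOM % ${#severities[@]}]}" \
    "$((RANDOM % 40 + 60))"
  sleep 1
done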

Kubeflow Cluster Deployments – With Namespaces and Purpose

Namespace | Deployment | Purpose
auth | dex | OIDC identity provider used for authentication (e.g., GitHub, LDAP, Google).
oauth2-proxy | oauth2-proxy | Acts as a reverse proxy, handling OAuth2-based login and token exchange for web UIs.
cert-manager | cert-manager | Manages TLS certificates (e.g., from Let’s Encrypt) for secure ingress.
cert-manager | cert-manager-cainjector | Injects CA data into webhook configurations automatically.
cert-manager | cert-manager-webhook | Handles dynamic admission control for cert-manager resources.
default | nfs-provisioner-nfs-subdir-external-provisioner | Dynamic NFS-based storage provisioning using sub-directories.
istio-system | cluster-local-gateway | Istio gateway for internal-only traffic between services.
istio-system | istio-ingressgateway | External-facing Istio gateway handling ingress traffic.
istio-system | istiod | Istio control plane: manages sidecars, traffic rules, certificates.
knative-serving | activator | Buffers requests for scale-to-zero services until pods are ready.
knative-serving | autoscaler | Monitors traffic and scales Knative services up/down.
knative-serving | controller | Reconciles Knative Serving CRDs like Revision, Service.
knative-serving | net-istio-controller | Integrates Knative networking with Istio for traffic routing.
knative-serving | net-istio-webhook | Admission webhook for validating Istio networking resources.
knative-serving | webhook | Validates and mutates Knative resources at creation.
kube-system | coredns | Internal DNS server for Kubernetes service discovery.
kubeflow-user-example-com | ml-pipeline-ui-artifact | UI to view pipeline artifacts.
kubeflow-user-example-com | ml-pipeline-visualizationserver | Renders charts/visuals of metrics during pipeline runs.
kubeflow | admission-webhook-deployment | Validates resources like notebooks before they are created.
kubeflow | cache-server | Caches pipeline steps to avoid redundant executions.
kubeflow | centraldashboard | The main UI for accessing Kubeflow features.
kubeflow | jupyter-web-app-deployment | UI for managing and spawning Jupyter notebooks.
kubeflow | katib-controller | Manages Katib experiment lifecycle.
kubeflow | katib-db-manager | Seeds the Katib DB schema and manages connections.
kubeflow | katib-mysql | MySQL database to store experiments and trials.
kubeflow | katib-ui | Web UI to launch and view Katib experiments.
kubeflow | kserve-controller-manager | Main controller for managing InferenceService CRDs.
kubeflow | kserve-localmodel-controller-manager | Serves models from local storage.
kubeflow | kserve-models-web-app | Web UI to manage deployed models.
kubeflow | kubeflow-pipelines-profile-controller | Ties pipelines with user profiles and RBAC.
kubeflow | metadata-envoy-deployment | Sidecar proxy for the metadata API.
kubeflow | metadata-grpc-deployment | gRPC API server for metadata tracking.
kubeflow | metadata-writer | Writes pipeline metadata for lineage tracking.
kubeflow | minio | S3-compatible object store used to store pipeline artifacts, models, etc.
kubeflow | ml-pipeline | Orchestrates pipeline execution and lifecycle.
kubeflow | ml-pipeline-persistenceagent | Persists pipeline runs and metadata to the DB.
kubeflow | ml-pipeline-scheduledworkflow | Handles scheduled/cron-based pipeline runs.
kubeflow | ml-pipeline-ui | Web UI to browse and run ML pipelines.
kubeflow | ml-pipeline-viewer-crd | Manages the CRD for artifact viewing and rendering logic.
kubeflow | ml-pipeline-visualizationserver | Renders charts/visuals of metrics during pipeline runs.
kubeflow | mysql | Relational DB backend used by Kubeflow Pipelines for run and experiment metadata.
kubeflow | notebook-controller-deployment | Controller for managing notebook CRDs (spawns pods).
kubeflow | profiles-deployment | Manages user profiles, namespaces, and isolation.
kubeflow | pvcviewer-controller-manager | Renders a UI to view the contents of PVCs in the dashboard.
kubeflow | spark-operator-controller | Manages Spark applications and jobs on Kubernetes.
kubeflow | spark-operator-webhook | Validates Spark job submissions.
kubeflow | tensorboard-controller-deployment | Manages TensorBoard instances tied to experiments.
kubeflow | tensorboards-web-app-deployment | UI for launching and browsing TensorBoards.
kubeflow | training-operator | Custom controller for training jobs (TFJob, PyTorchJob, etc.).
kubeflow | volumes-web-app-deployment | Web UI to manage PVCs in the user’s namespace.
kubeflow | workflow-controller | Argo controller that runs workflows in Kubernetes.
metallb-system | controller | Manages allocation of external IPs for services using MetalLB.
