This is a kind of documentation, or walkthrough, of my work, which can be called a DevSecOps diary.

Lessons Learned With Windows Container Orchestration in Kubernetes With AWS EKS


Containers, containers, containers! We see them everywhere now; containers have revolutionized the software development world. Microservice architectures and dockerized apps are popular everywhere. Since Microsoft released ASP.NET Core, .NET apps can run on Linux too, which means they can take advantage of the fascinating Kubernetes. .NET Framework 1.0 was released in 2002, and after it became popular, lots of applications were built on the framework over the decades. Not all of those applications can be rewritten overnight. In the meantime, Docker on Windows also became available. Windows Server, version 2004, became available in 2020 and is a container-optimized server release of Windows.

So, for companies that have used the Microsoft .NET Framework for a long time and cannot change their entire code base overnight, it is worth looking into Windows containers to ship software the same way they already do with Linux containers. With this motivation, I explored Windows containers. I have noted down the lessons learned, hacks, and difficulties here; they might help someone else who is in the same migration process. All of the notes are specific to their time frame, mid-2020 to early 2021, and may change over time.

Unlike Linux containers, Windows containers cannot run across kernel versions, which means the image version has to match the host version. If you use Windows Server 2019, you have to use the ltsc2019 (LTSC) image:

docker pull mcr.microsoft.com/windows/servercore:ltsc2019

If you want to go with Windows Server, version 2004 (a confusing name; it is not the year 2004):

docker pull mcr.microsoft.com/windows/servercore:2004

Here is a sample Dockerfile I have been using. With PowerShell you have the power to create IIS sites, add or remove Windows features, whatever you want. Nothing noticeably interesting here? If you have already played with Windows containers, you will probably notice that one important line is missing: where is ServiceMonitor.exe w3svc?


# Base image: IIS on Windows Server Core, version 2004 (must match the host OS version)
FROM mcr.microsoft.com/windows/servercore/iis:windowsservercore-2004
WORKDIR /inetpub/wwwroot/MerchantTranslation
# Copy the build output into the site directory
COPY bin/Release/ /inetpub/wwwroot/appsite/
# Enable ASP.NET 4.x, then replace the default site with our own site on port 8080
RUN powershell -NoProfile -Command \
    Install-WindowsFeature NET-Framework-45-ASPNET ; \
    Install-WindowsFeature Web-Asp-Net45 ; \
    Import-Module WebAdministration ; \
    Remove-WebSite -Name 'Default Web Site' ; \
    New-IISSite -Name 'MerchantTranslationSite' -PhysicalPath 'C:\inetpub\wwwroot' -BindingInformation '*:8080:' ; \
    New-WebVirtualDirectory -Site 'MerchantTranslationSite' -Name appsite -PhysicalPath 'C:\inetpub\wwwroot' ; \
    New-WebApplication -Name MerchantTranslation -Site 'MerchantTranslationSite' -PhysicalPath 'C:\inetpub\wwwroot\appsite' -Force
EXPOSE 8080

Yes, you noticed correctly: I do not include ServiceMonitor.exe w3svc in the Dockerfile. Let me bring in a simple requirement here: we want to build the image once and use it in every environment (SIT, QA, UAT, PROD). The web.config and other settings files will be different in each environment, so you can put your configuration in a ConfigMap or Secret and copy it into the web directory during container startup. So here is the trick: mount your configuration, copy it to the destination, and then start ServiceMonitor.

containers:
 - name: {{ .Chart.Name }}
   securityContext:
     {{- toYaml .Values.securityContext | nindent 12 }}
   image: "{{ .Values.image.repository }}:{{ .Values.image.tag | default .Chart.AppVersion }}"
   imagePullPolicy: {{ .Values.image.pullPolicy }}
   args: {{ .Values.image.args }}
   command:
     - "powershell.exe"
     - "-Command"
     - "cp c:\\app\\secret-volume\\web.config Web.config"
     - ";mkdir c:\\cert"
     - ";cp c:\\app\\secret-volume\\*.cer c:\\cert\\"
     - ";cp c:\\app\\secret-volume\\*.pfx c:\\cert\\"
     - ";C:\\ServiceMonitor.exe"
     - "w3svc"

      volumes:
        - name: {{ template "service.name" . }}
          configMap:
            name: {{ template "service.fullname" . }}
            defaultMode: 0755
        {{- if .Values.secretvolume.enabled }}
        - name: secret-volume
          projected:
            sources:
              {{- range $key, $val := .Values.secretfiles }}
              - secret:
                  name: {{ $val }}
              {{- end }}
        {{- end }}
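
For the copy commands in the container command to find their files, the projected secret volume also has to be mounted into the container. A minimal sketch of the matching mount, assuming the path c:\app\secret-volume that the cp commands read from (the rest of the container spec is omitted, and the readOnly setting is my assumption):

```yaml
# Sketch only: goes under the container entry in the pod spec.
# The name matches the "secret-volume" volume defined in the volumes section;
# the mount path is the one the cp commands in the container command read from.
volumeMounts:
  - name: secret-volume
    mountPath: 'c:\app\secret-volume'
    readOnly: true   # assumption: the files are only copied, never modified in place
```

Because the files are copied out of the mount before ServiceMonitor starts, IIS reads its own copy of web.config rather than the mounted one.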

ServiceMonitor is currently distributed as part of the IIS, ASP.NET, and WCF images on Docker Hub. The recommendation is to layer your project on top of those official images rather than running ServiceMonitor directly in your Dockerfile.

Note for ECS users: while ($true) { Start-Sleep -Seconds 3600 } > InitializeContainer.ps1

AWS EKS in general supports Windows, with a few considerations. Enabling Windows support in EKS needs some additional steps that are not needed for a Linux-only cluster: you need to deploy the VPC resource controller to your cluster and create the VPC admission controller webhook. These steps are well documented.
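
One more consideration in a cluster that mixes Linux and Windows nodes is scheduling: Windows pods have to land on Windows nodes. A minimal sketch using the well-known node labels; the build number 10.0.17763 corresponds to Windows Server 2019 and is only an assumption about your node group:

```yaml
# Sketch only: goes under the pod template spec of a Windows workload.
nodeSelector:
  kubernetes.io/os: windows
  # Optional: also pin to a host build that matches the container image.
  # 10.0.17763 is Windows Server 2019; adjust it to the build of your nodes.
  node.kubernetes.io/windows-build: '10.0.17763'
```

Linux workloads in the same cluster need the opposite selector (kubernetes.io/os: linux) so they are not scheduled onto Windows nodes.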

Scaling issue and workaround

I deployed my cluster in AWS with EKS, and I faced a few issues with EKS while scaling. Initially, I started with Windows Server 2019, and at that moment EKS supported Kubernetes 1.18 at most.

I have some mission-critical services that need to scale a bit more than usual. In my load test environment, I scaled to 100 pods behind a service.

Alas! Suddenly the targets of the AWS ALB became unhealthy.


After double-checking everything a couple of times, I found that beyond 64 pods (why this magic number 64, I am not sure) all of the load balancer targets became unhealthy. This was not related to my settings, so I opened an AWS support case and learned a couple of new things. The AWS support team suggested that I enable WinDSR with the following script.

[string]$EKSBinDir = "$env:ProgramFiles\Amazon\EKS"
[string]$EKSBootstrapScriptName = 'Start-EKSBootstrap.ps1'
[string]$EKSBootstrapScriptFile = "$EKSBinDir\$EKSBootstrapScriptName"
(Get-Content $EKSBootstrapScriptFile).replace('"--proxy-mode=kernelspace",', '"--proxy-mode=kernelspace", "--feature-gates WinDSR=true", "--enable-dsr",') | Set-Content $EKSBootstrapScriptFile

Well, what is WinDSR? Study time.

Direct Server Return

DSR is an implementation of asymmetric network load distribution in load-balanced systems, meaning that the request and response traffic uses a different network path.

From the official k8s documentation:

[image: windows container service]

Hmm, it looks like DSR is only supported from Kubernetes 1.19. At that moment EKS had not yet released 1.19, but AWS claimed that WinDSR should be supported with 1.18. I had no luck, so these two tickets were born.

Issue Windows nodes behind Loadbalancer going OutOfService when the no. of pods is 65 or more

Issue Windows nodes behind Loadbalancer going OutOfService when the no. of pods is 65 or more

Luckily, AWS had also released the AWS Load Balancer Controller, which supports forwarding to multiple target groups, so we could scale to at least 64*5 pods, and we continued with that for a while.

apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: olaservice
  namespace: external
  annotations:
    kubernetes.io/ingress.class: alb
    alb.ingress.kubernetes.io/scheme: internet-facing
    alb.ingress.kubernetes.io/tags: Environment=qa,Team=awesome
    alb.ingress.kubernetes.io/certificate-arn: arn<sudo>
    alb.ingress.kubernetes.io/subnets: subnet-<>,subnet-<>,subnet-<>
    alb.ingress.kubernetes.io/actions.forward-multiple-tg: >
      {"type":"forward","forwardConfig":{"targetGroups":[{"serviceName":"olaservice-set1-service","servicePort":6001,"weight":25},{"serviceName":"olaservice-set2-service","servicePort":6001,"weight":25},{"serviceName":"olaservice-set3-service","servicePort":6001,"weight":25},{"serviceName":"olaservice-set4-service","servicePort":6001,"weight":25}],"targetGroupStickinessConfig":{"enabled":false,"durationSeconds":60}}}

spec:
  rules:
    - host: qa-olaservice.<>.com
      http:
        paths:
          - path: /*
            backend:
              serviceName: forward-multiple-tg
              servicePort: use-annotation
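
Each serviceName in the forward-multiple-tg annotation has to exist as a Kubernetes Service, and each Service stays below the 64-pod limit on its own. A minimal sketch of one of them, assuming a hypothetical pod label app: olaservice-set1 and a container listening on port 8080; the other three sets would differ only in name and selector:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: olaservice-set1-service   # referenced by serviceName in the annotation
  namespace: external
spec:
  # Assumption: the controller registers nodes (instance targets),
  # which requires a NodePort service.
  type: NodePort
  selector:
    app: olaservice-set1          # hypothetical label on the pods of set 1
  ports:
    - port: 6001                  # matches servicePort in the annotation
      targetPort: 8080            # assumed container port
      protocol: TCP
```

Splitting the pods into several Deployments with their own Services is what lets the weights in the annotation spread traffic across the sets.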



Notes

EKS has now released Kubernetes 1.19 and this issue is resolved, meaning you can scale beyond 64 pods by enabling WinDSR.

More Cloud provider issues

Next, we ran into another EKS AMI bug: while setting cluster_service_ipv4_cidr, we found that our DNSClusterIP was not changed. The AWS support team was helpful and gave us a workaround. I mention this not to say that AWS or EKS has bugs or issues, but because when you work with technology that is still immature and used by very few people, many use cases remain uncovered, and there is always a chance of falling into a trap you cannot get out of by yourself. You need a lot of courage and the mindset to accept failure or to discover a sudden limitation that might be beyond expectation.

Although Windows container images are still huge and Windows container support in Kubernetes has some limitations, it is worth considering if those limitations are not crucial for your use case. Your organization may have already adopted Linux containers while still struggling with legacy apps that have to be managed on EC2. In that case, if you convert them to Windows containers, all of your CI/CD pipelines and other stacks become similar, and you can buy some time to deprecate your legacy .NET apps.