Why is audio transcription useful?
Common use-cases for transcribing audio include a bot that summarises customer complaints during a Zoom call, collects negative product feedback from YouTube reviews, or generates a set of timestamps for YouTube videos, which are later attached via API. You could even take traditional voice or VoIP recordings from a customer service center, and transcribe each one to look for training issues or high-performing telephone agents. If you listen to podcasts on a regular basis and have ever read the show notes, they may well have been generated by a transcription model.
GPU is generally faster than CPU, but CPU can also be very effective if you are able to batch up requests via the OpenFaaS Asynchronous invocations system, and collect the results later on. To collect results from async invocations, you can supply a callback URL to the initial request, or have the function store its result in S3. We have some tutorials in the conclusion that show this approach for other use-cases like PDF generation.
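For illustration, here's a minimal sketch of how a producer might submit work to the asynchronous endpoint from Python. It assumes a transcription function named whisper, like the one built later in this post, deployed behind a local gateway, and uses a hypothetical callback receiver URL:

```python
import requests

# Assumptions: the gateway is reachable on 127.0.0.1:8080 and a function
# called "whisper" is deployed; the callback URL is a hypothetical receiver.
gateway = "http://127.0.0.1:8080"
audio_url = "https://example.com/track.mp3"

resp = requests.post(
    f"{gateway}/async-function/whisper",
    data=audio_url,
    headers={"X-Callback-Url": "https://example.com/webhooks/transcripts"},
)

# Async invocations are queued and return 202 Accepted straight away.
# The transcript is POSTed to the callback URL when the work completes.
print(resp.status_code, resp.headers.get("X-Call-Id"))
```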
Here’s what we’ll cover:
Kubernetes has support for managing GPUs across different nodes using device plugins. The setup in your cluster will depend on your platform and GPU vendor. We will be setting up a k3s cluster with NVIDIA container runtime support.
k3sup is a light-weight CLI utility that lets you quickly set up k3s on any local or remote VM. If you already have a k3s cluster, you can also use k3sup to join an additional agent to your cluster.
You can use our article on how to setup a production-ready Kubernetes cluster with k3s on Akamai cloud computing as an additional reference.
I would suggest setting up the cluster first, then, once that is done, SSH into any agent or server with a GPU to prepare the host OS by installing the Nvidia drivers and container runtime package.
Install the Nvidia drivers, for example: apt install -y cuda-drivers-fabricmanager-515 nvidia-headless-515-server
This example uses driver version 515, but you should select the appropriate driver version for your hardware.
Make sure the GPU is detected on the system by running the nvidia-smi command.
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 545.23.08 Driver Version: 545.23.08 CUDA Version: 12.3 |
|-----------------------------------------+----------------------+----------------------+
| GPU Name Persistence-M | Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap | Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|=========================================+======================+======================|
| 0 NVIDIA GeForce GT 1030 On | 00000000:01:00.0 Off | N/A |
| 35% 19C P8 N/A / 19W | 92MiB / 2048MiB | 0% Default |
| | | N/A |
+-----------------------------------------+----------------------+----------------------+
Install the Nvidia container runtime packages.
Add the NVIDIA Container Toolkit package repository by following the instructions at: https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/latest/install-guide.html#installing-with-apt
Install the NVIDIA container runtime: apt install -y nvidia-container-runtime
Install k3s on the host, or join it to your existing cluster:

curl -ksL get.k3s.io | sh -

k3s automatically detects the NVIDIA container runtime and adds it to the containerd configuration it generates. Verify this by running:

grep nvidia /var/lib/rancher/k3s/agent/etc/containerd/config.toml
Once the hosts have been prepared and your cluster is running, apply the NVIDIA runtime class in the cluster:
cat > nvidia-runtime.yaml <<EOF
apiVersion: node.k8s.io/v1
kind: RuntimeClass
metadata:
  name: nvidia
handler: nvidia
EOF
kubectl apply -f nvidia-runtime.yaml
Next install OpenFaaS in your cluster. GPU support is a feature that is only available in the commercial version of OpenFaaS.
Follow the installation instructions in the docs to install OpenFaaS using the official Helm chart.

Function deployments that require a GPU will need to have the nvidia runtimeClass set. OpenFaaS uses Profiles to support adding additional Kubernetes-specific configuration to function deployments.
Create a new OpenFaaS Profile to set the runtimeClass:
cat > gpu-profile.yaml <<EOF
kind: Profile
apiVersion: openfaas.com/v1
metadata:
  name: gpu
  namespace: openfaas
spec:
  runtimeClassName: nvidia
EOF
kubectl apply -f gpu-profile.yaml
Profiles can be applied to a function through annotations. To apply the gpu profile to a function, add the annotation com.openfaas.profile: gpu to the function configuration.
In this section we will create a function that runs the Whisper speech recognition model to transcribe an audio file.
Every OpenFaaS function is built into Open Container Initiative (OCI) format container image and published into a container registry, then when it’s deployed a fully qualified image reference is sent to the Kubernetes node. Kubernetes will then pull down that image and start a Pod from it for the function.
OpenFaaS supports various different languages through the use of its own templates concept. The job of a template is to help you create a container image, whilst abstracting away most of the boiler-plate code and implementation details.
The Whisper model is available as a Python package. We will be using a slightly adapted version of the python3-http template called python3-http-cuda to scaffold our function. To provide the CUDA Toolkit from NVIDIA, the python3-http-cuda template uses nvidia/cuda instead of Debian as the base image.

Create a new function with the OpenFaaS CLI, then rename its YAML file to stack.yaml. We do this so we don't need to specify the name using --yaml or -f on every command.
# Change this line to your own registry
export OPENFAAS_PREFIX="ttl.sh/of-whisper"
# Pull the python templates
faas-cli template pull https://github.com/skatolo/python-flask-template
# Scaffold a new function using the python3-http-cuda template
faas-cli new whisper --lang python3-http-cuda
# Rename the function configuration file to stack.yaml
mv whisper.yaml stack.yaml
The function handler whisper/handler.py is where we write our custom code. In this case the function retrieves an audio file from a URL that is passed in through the request body. Next, the Whisper model transcribes the audio file and the transcript is returned in the response.
import tempfile
from urllib.request import urlretrieve

import whisper

def handle(event, context):
    models_cache = '/tmp/models'
    model_size = "tiny.en"

    url = str(event.body, "UTF-8")

    audio = tempfile.NamedTemporaryFile(suffix=".mp3", delete=True)
    urlretrieve(url, audio.name)

    model = whisper.load_model(name=model_size, download_root=models_cache)
    result = model.transcribe(audio.name)

    return (result["text"], 200, {'Content-Type': 'text/plain'})
The first time the function is invoked it will download the model and save it to the location set in the models_cache variable, /tmp/models. Subsequent invocations of the function will not need to refetch the model.
It is good practice to make your function write only to the /tmp folder. This way you can make the function's file system read-only. OpenFaaS supports this by setting readonly_root_filesystem: true in the stack.yaml file. Only the temporary /tmp folder will still be writable. This prevents the function from writing to or modifying the filesystem and provides tighter security for your functions.
Before we can build, deploy and run the function there are a couple of configuration settings that we need to run through.
Add runtime dependencies
Our function handler uses the openai-whisper Python package. Edit the whisper/requirements.txt file and add the following line:
openai-whisper
The whisper package also requires the command-line tool ffmpeg for audio transcoding, which needs to be installed in the function container. The OpenFaaS python3 templates support specifying additional packages that will be installed with apt through the ADDITIONAL_PACKAGE build argument.

Update the stack.yaml file:
functions:
  whisper:
    lang: python3-http-cuda
    handler: ./whisper
    image: whisper:0.0.1
+   build_args:
+     ADDITIONAL_PACKAGE: "ffmpeg"
Apply profiles
The function will need to use the alternative nvidia runtime class in order to use the GPU. This can be applied by using the OpenFaaS gpu profile created earlier. Add the com.openfaas.profile: gpu annotation to the stack.yaml file:
functions:
  whisper:
    lang: python3-http-cuda
    handler: ./whisper
    image: whisper:0.0.1
+   annotations:
+     com.openfaas.profile: gpu
Configure timeouts
It is common for inference or other machine learning workloads to be long running jobs. In this example transcribing the audio file can take some time depending on the size of the file and the GPU speed. To ensure the function can run to completion timeouts for the function and OpenFaaS components need to be configured correctly.
For more info see: Expanding timeouts.
functions:
  whisper:
    lang: python3-http-cuda
    handler: ./whisper
    image: whisper:0.0.1
+   environment:
+     write_timeout: 5m5s
+     exec_timeout: 5m
Once the function is configured you can deploy it straight to the Kubernetes cluster using the faas-cli:
faas-cli up whisper
Then, invoke the function when ready.
curl -i http://127.0.0.1:8080/function/whisper -d https://example.com/track.mp3
Depending on the number of GPUs available in your cluster and the available memory for each GPU you might want to limit the amount of requests that can go to the whisper function at once. Kubernetes doesn’t implement any kind of request limiting for applications, but OpenFaaS can help here.
To prevent overloading the Pod and GPU, we can set a hard limit on the number of concurrent requests the function can handle. This is done by setting the max_inflight environment variable on the function.

For example, if your GPU has enough memory to handle 6 concurrent requests, you can set max_inflight: 6. Any subsequent requests would be dropped and receive a 429 response. This assumes the producer can buffer the requests to retry them later on; a minimal client-side retry sketch follows the configuration below. Fortunately, when using async in OpenFaaS, the queue-worker does just that. You can learn how here: How to process your data the resilient way with back pressure
functions:
  whisper:
    lang: python3-http-cuda
    handler: ./whisper
    image: ttl.sh/of-whisper:0.0.1
    environment:
      write_timeout: 5m5s
      exec_timeout: 5m
+     max_inflight: 6
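As a hedged illustration of what such a producer could look like, here's a minimal client-side retry sketch in Python. It assumes the function is invoked synchronously through a local gateway; when using async invocations, the OpenFaaS queue-worker handles the retries for you.

```python
import time
import requests

def transcribe(audio_url, retries=5, backoff=10):
    # Hypothetical synchronous invocation of the whisper function via a local gateway.
    for attempt in range(retries):
        r = requests.post("http://127.0.0.1:8080/function/whisper", data=audio_url)
        if r.status_code != 429:
            r.raise_for_status()
            return r.text
        # 429 means the function is at its max_inflight capacity - back off and retry.
        time.sleep(backoff * (attempt + 1))
    raise RuntimeError("function still busy after {} attempts".format(retries))

print(transcribe("https://example.com/track.mp3"))
```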
You can still try out the Whisper inference function even if you don’t have a GPU available or when you don’t have the commercial version of OpenFaaS. With only a couple of changes the function can run with CPU inference.
The function handler does not need to change. The openai-whisper package automatically detects whether a GPU is available and will fall back to using CPU as a default.

Change the template of the function in the stack.yaml file to python3-http and remove the gpu profile annotation.
  whisper:
-   lang: python3-http-cuda
+   lang: python3-http
    handler: ./whisper
    image: ttl.sh/of-whisper:0.0.1
-   annotations:
-     com.openfaas.profile: gpu
Pull the python3-http template:
faas-cli template store pull python3-http
Deploy the function and invoke it with curl as shown in the previous section. The function will now run the inference on CPU instead. Depending on your hardware this will probably increase the execution time compared to running on a GPU, so make sure to adjust your timeouts as required.
Take a look at some other patterns that can be useful for running ML workflows and pipelines with OpenFaaS.
In this tutorial we showed how a K3s cluster can be configured with NVIDIA container runtime support to run GPU-enabled containers. OpenFaaS was installed in the cluster with an additional gpu Profile, which is required to run functions with the alternative nvidia runtimeClass. Using a custom Python template that includes the CUDA Toolkit from NVIDIA, we created a function to transcribe audio files with the OpenAI Whisper model.

We ran through several configuration steps for the function to set appropriate timeouts and applied the OpenFaaS gpu profile to make the GPU available in the function container. Additionally, we discussed how OpenFaaS features like async invocations and retries can be used together with concurrency limiting to prevent overloading your GPU while still making sure all requests can run to completion.
For people who don’t have a GPU available or that are running the Community Edition of OpenFaaS, we showed how the same function can be deployed to run with CPU inference.
We showed you how to apply concurrency limiting to make sure the GPU wasn’t overwhelmed with requests, however Kubernetes does have a very basic way of scheduling Pods to GPUs. The approach taken is to exclusively dedicate at least 1 GPU to a Pod, so if you wanted the function to scale, you’d need several nodes each with at least one GPU.
In Kubernetes this is done by passing in an additional value to the Pod under the requests/limits section i.e.
resources:
  limits:
    nvidia.com/gpu: 1
We’re looking into the best way to add this for OpenFaaS functions - either directly for each Function Custom Resource, or via a Profile, so feel free to reach out if that’s of interest to you.
With the latest versions of the OpenFaaS helm charts, watchdog and python-flask template, you can now stream responses using Server Sent Events (SSE) directly from your functions. Prior to these changes, if a chat completion was going to take 10 seconds to emit several paragraphs of text, the user would have had to wait that long to see the first word.
Now, the first word will be displayed as soon as it’s available from the OpenAI API. This is a great way to improve the user experience of your OpenAI-powered applications and was requested by one of our customers building a chat-driven experience for DevOps.
Server Sent Events (SSE) are a way to stream data from a server to a client. They are a simple way to push data from a server to a client, and are used in a variety of applications, including chat applications, real-time analytics, and more.
An alternative to SSE is long polling, where the client makes a request to the server and waits for a response. This is a less efficient way to stream data, as it requires the client to make a new request every time it wants to receive new data.
SSEs only work in one direction, so the client cannot send data back to the server. If two-way communication is required, then websockets are a better option.
If we use the python3-flask template, it has built-in support for returning a streaming response from Flask, using the stream_with_context() helper. This is a generator function that yields data to the client as it becomes available.

You can pull down the Python template using faas-cli template store pull python3-flask-debian, then create a new function with: faas-cli new --lang python3-flask-debian stream.
We're using the debian variant instead of the normal, smaller alpine variant of the image because it contains everything required to build the dependencies we'll need. On balance, the Debian image is still smaller than the Alpine one when all the build tools have been added in.
To learn more about the Python template, see the docs.
Example handler.py:
from flask import stream_with_context, request, Response
import requests
from langchain_community.chat_models import ChatOpenAI
from os import environ

environ["OPENAI_API_KEY"] = "Bearer foo"

chat_model = ChatOpenAI(
    model="gpt-3.5-turbo",
    openai_api_base="https://openai.inlets.dev/v1",
)

def handle(req):
    prompt = "You are a helpful AI assistant, try your best to help, respond with truthful answers, but if you don't know the correct answer, just say sorry I can't help. Answer this question: {}".format(req)
    print("Prompt: {}".format(prompt))

    def stream():
        for chunk in chat_model.stream(prompt):
            print(chunk.content + "\n", flush=True, end="")
            yield f'data: {chunk.content}\n\n'

    return Response(stream(), mimetype='text/event-stream')
Example requirements.txt:
requests
langchain_community
openai
Next, in your stack.yaml file, set buffer_body: true under the environment: section. This reads all of the request input into memory, then sends it to the function, so there's no streaming input, just a streaming output.
I set up a self-hosted API endpoint that is compatible with OpenAI for this testing, but you can use the official API endpoint too. Just make sure you pass in your OpenAI token using an OpenFaaS secret and not an environment variable. Definitely don’t hard-code it into your function’s source code because it will be readable by anyone with the image.
curl -i http://127.0.0.1:8080/function/stream \
-H "Content-Type: text/plain" \
-H "Accept: text/event-stream" \
-d "What are some calorie dense foods?"
Example output:
HTTP/1.1 200 OK
Content-Type: text/event-stream; charset=utf-8
Date: Thu, 11 Jan 2024 13:33:04 GMT
Server: waitress
Transfer-Encoding: chunked
data: Some
data: cal
data: orie
data: dense
data: food
data: s
data: include
data: n
data: uts
data: ,
data: se
data: eds
data: ,
data: av
data: oc
data: ados
data: ,
data: che
data: ese
data: ,
data: pe
data: an
data: ut
data: ut
data: but
data: ter
data: ,
data: dark
data: ch
data: oc
data: olate
...
I trimmed the response, but you get the idea. This gave me text quite quickly, but if we’d had to wait for the full text it would have taken up to 30 seconds.
As a quick note, you’ll need to pay attention to your timeout values as the default timeouts for your function and installation may not be enough to stream a complete response from the remote API.
The prompt could probably do with some tuning, just edit handler.py and let me know what you come up with.
I used c0sogi/llama-api to set up a local OpenAI REST API endpoint using a free model. The answers are not the same caliber as gpt-3.5, however it is a good way to test the SSE functionality.
You can learn more about the official OpenAI Python SDK here.
Functions can also be called asynchronously, but if you’re going down this route, it probably doesn’t make sense to use server sent events. You can learn more about asynchronous functions in the OpenFaaS docs.
In a short period of time, we were able to add support to the various OpenFaaS components and Python template in order to support SSE for OpenAI. You could also use a generator to stream back your own data to a client, just remember that the response is text-based. So to stream back binary data like an image, you’d need to base64 encode each chunk.
From here, you can now consume the streaming function in a front-end built with React, Vue.js, or Nuxt.js, etc, or from a CLI application.
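As an example of consuming the stream from a CLI application, here's a minimal Python sketch using the requests library. It assumes the function is exposed on a local gateway, as in the earlier curl example:

```python
import requests

url = "http://127.0.0.1:8080/function/stream"

with requests.post(url,
                   data="What are some calorie dense foods?",
                   headers={"Accept": "text/event-stream"},
                   stream=True) as r:
    for line in r.iter_lines(decode_unicode=True):
        # SSE frames are newline-separated; data lines start with "data: "
        if line and line.startswith("data: "):
            print(line[len("data: "):], end="", flush=True)

print()
```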
So what do you need to try out OpenFaaS?
If you’re familiar with Kubernetes, you can get started with the Helm chart.
If Kubernetes is your antithesis, then you might like faasd instead, which can run on a single VM, and I’ve written a comprehensive manual for you to get started with it.
If you’d like to learn more about OpenFaaS, we have a weekly call every Wednesday and we’d love to see you there to hear how you’re using functions.
All the component parts are readily available to take user-supplied source code, produce an OpenFaaS function, and deploy it with its own custom HTTPS URL. The target user for this kind of workflow is a SaaS company, or an internal platform team who want to offer a “code to URL” experience for their users.
Learn more about multi-tenant use of OpenFaaS here: Build a Multi-Tenant Functions Platform with OpenFaaS
If you follow all of the steps in this guide, then you’ll be able to take code like this from a user:
"use strict"
module.exports = async (event, context) => {
const result = {
status: "Received input: " + JSON.stringify(event.body)
};
return context
.status(200)
.succeed(result);
}
And turn it into an HTTPS URL like this one: https://helloworld.webhooks.example.com
At a conceptual level, here’s what’s involved:
Conceptual diagram showing overview of flow
Most of the steps in this tutorial will be shown using manual HTTP calls. This is so you can understand the role of each component, however when it comes to building your own integration, you could make these calls from your own code, or even write an OpenFaaS function to do it.
You’ll need a retail or trial license for OpenFaaS for Enterprises. Reach out if you’d like to try this tutorial and let us know what you’re building.
You’ll also need:
- An OpenFaaS for Enterprises installation with clusterRole: true set in the Helm chart values
- The OpenFaaS CLI (faas-cli)

You can create a separate Kubernetes namespace for a tenant via the HTTP REST API.
).You can create a separate Kubernetes namespace for a tenant via the HTTP REST API.
The name must conform to DNS naming rules, and must also be unique. You could use a GUID, and record a mapping in your application or use a human-readable name. Names must not begin with a number.
faas-cli namespace create tenant-1
I won’t repeat the HTTP API call here, however you can view it in the OpenFaaS REST API docs.
If your users only have one namespace, you may name it after them, i.e. tenant-1, but if they can have multiple, you'll want to add some annotations so that you can identify them later.
faas-cli namespace create webhooks \
--annotation "tenant=tenant-1" \
--annotation "email=alex@example.com"
Here we call the namespace “webhooks”, then add annotations to map some of our own custom data.
You'll now see the extra namespace via faas-cli namespace list.
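If you'd rather work with the REST API from code, here's a minimal sketch in Python that lists the namespaces the gateway manages. The gateway URL and admin credentials are placeholders, and it assumes basic auth is enabled on the gateway:

```python
import requests

# Placeholder gateway URL and admin credentials - replace with your own.
gateway = "http://127.0.0.1:8080"
auth = ("admin", "your-gateway-password")

# Returns the list of namespaces, which should now include "webhooks".
r = requests.get(f"{gateway}/system/namespaces", auth=auth)
r.raise_for_status()
print(r.json())
```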
There are two ways you could go about taking in source code:
Typically, we see our customers going for option 1, with interpreted languages such as: Node.js, PHP, Python, etc.
There are three ways to try out the Function Builder API, including:

1. curl and bash commands to build a package and invoke the endpoint
2. faas-cli publish --remote-builder, which uses faas-cli to do all of the steps in 1.

A conceptual diagram showing how to make a call to the Function Builder API:
Conceptual diagram showing user code being shrink-wrapped using a template, and submitted to the Function Builder.
Assuming you’ve deployed the Function Builder API, port-forward it to your local machine:
kubectl port-forward -n openfaas \
deploy/pro-builder 8081:8080
We'll try the curl and bash method, because it shows each step that's required.
Obtain the payload secret required to sign the request:
export PAYLOAD=$(kubectl get secret -n openfaas payload-secret -o jsonpath='{.data.payload-secret}' | base64 --decode)
echo $PAYLOAD > $HOME/.openfaas/payload.txt
Prepare a temporary directory
rm -rf /tmp/functions
mkdir -p /tmp/functions
cd /tmp/functions
Create a new function
faas-cli new --lang node18 hello-world
The --shrinkwrap flag performs templating without actually invoking docker or buildx to build or publish an image. The Function Builder API will do that for us instead.
faas-cli build --shrinkwrap -f hello-world.yml
If you look in the ./build/hello-world folder you'll see a build context that can be built with Docker.

Now rename "hello-world" to "context", since that's the folder name expected by the builder:
cd build
rm -rf context
mv hello-world context
Then, create a config file with the registry and the image name that you want to use for publishing the function.
Build-args can also be specified here for proxies, or enabling/disabling Go modules for instance.
export DOCKER_USER=alexellis2
echo -n '{"image": "ttl.sh/'$DOCKER_USER'/test-image-hello:0.1.0"}' > com.openfaas.docker.config
The test image will be published to the ttl.sh public and ephemeral registry which does not require authentication.
You can follow detailed instructions to set up authentication for the Docker Hub, AWS ECR, GCP GCR, or a self-hosted registry like CNCF Harbor.
Now we can invoke the Function Builder API to build and publish the function:
Create a tar of the build context:
tar cvf req.tar --exclude=req.tar .
Sign the payload:
PAYLOAD=$(kubectl get secret -n openfaas payload-secret -o jsonpath='{.data.payload-secret}' | base64 --decode)
HMAC=$(cat req.tar | openssl dgst -sha256 -hmac $PAYLOAD | sed -e 's/^.* //')
Invoke the build with the following:
curl -H "X-Build-Signature: sha256=$HMAC" -s http://127.0.0.1:8081/build -X POST --data-binary @req.tar | jq
{
  "logs": [
    ....
    "v: 2021-10-20T16:48:34Z exporting to image 8.01s"
  ],
  "image": "ttl.sh/alexellis2/test-image-hello:0.1.0",
  "status": "success"
}
If it was successful, you'll get a "status": "success" returned along with the image name you passed in. If it failed, you can return the logs element to the user, which will show any failed build or unit testing steps.
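If you'd prefer to make the same call from your own code instead of bash, here's a minimal Python sketch that signs the tarball and invokes the builder. It assumes req.tar was created as shown above, the payload secret was saved to $HOME/.openfaas/payload.txt earlier, and the pro-builder is still port-forwarded to 127.0.0.1:8081:

```python
import hashlib
import hmac
import os
import requests

# Read the payload secret and the tarred build context created earlier.
with open(os.path.expanduser("~/.openfaas/payload.txt"), "rb") as f:
    payload_secret = f.read().strip()

with open("req.tar", "rb") as f:
    tarball = f.read()

# Sign the tarball with HMAC-SHA256, just like the openssl command above.
digest = hmac.new(payload_secret, tarball, hashlib.sha256).hexdigest()

r = requests.post(
    "http://127.0.0.1:8081/build",
    data=tarball,
    headers={"X-Build-Signature": "sha256=" + digest},
)

result = r.json()
print(result.get("status"), result.get("image"))
```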
Now we can make a HTTP call to deploy the function.
Like before, there are several ways you can do this:

1. A direct HTTP call to the OpenFaaS REST API
2. faas-cli deploy

Let's use 2. with faas-cli deploy.
faas-cli deploy \
--image ttl.sh/alexellis2/test-image-hello:0.1.0 \
--name hello-world \
--namespace webhooks
We should also consider what additional settings we may want for the function at this time.
Here’s a fuller example, but by no means completely exhaustive:
faas-cli deploy \
--image ttl.sh/alexellis2/test-image-hello:0.1.0 \
--name hello-world \
--namespace webhooks \
--annotation "com.example.tenant=tenant-1" \
--annotation "com.example.plan=free" \
--label com.openfaas.scale.zero=true \
--label com.openfaas.scale.zero-duration=3m \
--label com.openfaas.scale.min=1 \
--label com.openfaas.scale.max=10 \
--label com.openfaas.scale.type=rps \
--label com.openfaas.scale.target=500 \
--memory-request=64Mi \
--memory-limit=128Mi \
  --cpu-request=50m
The name, image, namespace settings configure the function’s deployment, then we have additional metadata supplied via annotations, and labels for autoscaling.
We enabled scale to zero, with an idle period of 3 minutes, a minimum of 1 replica of the function, and a maximum of 10. We then set the autoscaler to scale based upon a target of 500 Requests Per Second (RPS), set a request and limit for RAM, and a requested value for CPU.
If you'd like to see the equivalent HTTP REST call, you can prefix the command with FAAS_DEBUG=1.
PUT http://127.0.0.1:8080/system/functions
Content-Type: [application/json]
User-Agent: [faas-cli/dev]
Authorization: [Basic **********]

{
  "service": "hello-world",
  "image": "ttl.sh/alexellis2/test-image-hello:0.1.0",
  "namespace": "webhooks",
  "labels": {
    "com.openfaas.scale.max": "10",
    "com.openfaas.scale.min": "1",
    "com.openfaas.scale.target": "500",
    "com.openfaas.scale.type": "rps",
    "com.openfaas.scale.zero": "true",
    "com.openfaas.scale.zero-duration": "3m"
  },
  "annotations": {
    "com.example.plan": "free",
    "com.example.tenant": "tenant-1"
  },
  "limits": {
    "memory": "128Mi"
  },
  "requests": {
    "memory": "64Mi",
    "cpu": "50m"
  }
}
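For completeness, here's a minimal Python sketch of making that deployment call yourself with the requests library. The gateway URL and basic-auth credentials are placeholders, and the body mirrors a subset of the captured request above:

```python
import requests

# Placeholder gateway URL and admin credentials - replace with your own.
gateway = "http://127.0.0.1:8080"
auth = ("admin", "your-gateway-password")

function = {
    "service": "hello-world",
    "image": "ttl.sh/alexellis2/test-image-hello:0.1.0",
    "namespace": "webhooks",
    "labels": {
        "com.openfaas.scale.zero": "true",
        "com.openfaas.scale.zero-duration": "3m",
    },
    "annotations": {"com.example.tenant": "tenant-1"},
    "requests": {"memory": "64Mi"},
    "limits": {"memory": "128Mi"},
}

# The captured request above used PUT; the REST API also accepts POST
# to /system/functions for a first-time deployment.
r = requests.put(f"{gateway}/system/functions", json=function, auth=auth)
print(r.status_code)
```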
The fourth part of this tutorial is to create a custom domain and TLS certificate for the function so that the user can access it via a custom URL.
The Ingress Operator is an abstraction over Kubernetes Ingress that makes it quick and easy to create custom Ingress records configured to expose a function over a custom HTTP path or domain, or both. Access to functions is defined using a “FunctionIngress” Custom Resource Definition (CRD).
You’ll find examples of FunctionIngress in the documentation.
It’s not a compulsory component, and you could work with Ingress directly, or even Istio if you wished.
You’ll need to have the Ingress Operator enabled in the values.yaml file for OpenFaaS.
Then, create a DNS record for the user’s function.
On AWS EKS, LoadBalancers have DNS names, so you'd create a CNAME, but everywhere else they tend to have IP addresses, so you create an A record. These can be created via CLI tools, or via an API/SDK from your DNS provider such as AWS Route 53, Google Cloud DNS, DigitalOcean, Cloudflare, etc.
If the external IP was 176.58.106.241, then you'd create a DNS A record such as:
176.58.106.241 hello-world.webhooks.example.com
I’ve used the format of function.namespace.domain, but you can use any format that you like. Perhaps one sub-domain per tenant, or one sub-domain per namespace. In any case, having a unique sub-domain is important if user functions need to store cookies.
You’ll need to create a Let’s Encrypt issuer. Again, we recommend doing this either per tenant via a ClusterIssuer, or via an Issuer in each namespace. Details can be found in the README.
Now, create a FunctionIngress record, either using the Kubernetes clientset for the ingress-operator or the Custom Resource Definition (CRD) and kubectl.
The below assumes the ingress-controller is ingress-nginx, using a ClusterIssuer called “tenant1”:
apiVersion: openfaas.com/v1
kind: FunctionIngress
metadata:
  name: helloworld-tls
  namespace: openfaas
spec:
  domain: "helloworld.webhooks.example.com"
  function: "helloworld"
  functionNamespace: "webhooks"
  ingressType: "nginx"
  tls:
    enabled: true
    issuerRef:
      name: "letsencrypt-tenant1"
      kind: "ClusterIssuer"
All FunctionIngress records are created in the “openfaas” namespace, then you reference the function’s namespace via the “functionNamespace” field.
Whenever traffic hits https://helloworld.webhooks.example.com, it will be re-routed to https://gateway.openfaas:8080/function/helloworld.webhooks.
Every Function will need its own FunctionIngress record, however Issuers and ClusterIssuers for cert-manager can be shared across multiple functions.
Now, if you expect to see many functions created each with their own domains, you may want to use a DNS01 challenge, and register a sub-domain per namespace, or per tenant instead of using individual A records and HTTP01 challenges.
With HTTP01 challenges and individual DNS records, you’d have 20 DNS records and 20 disparate TLS certificates. On the plus side, this is the simplest configuration, and on the downside you may run into Let’s Encrypt rate limits.
function1.tenant1.example.com, function2.tenant1.example.com, function3.tenant1.example.com, etc.
With a DNS01 challenge and a sub-domain, you'd have a single wildcard DNS record with a single TLS certificate, used for as many functions as you wanted.
*.tenant1.example.com
You can learn the differences here: ACME Challenge Types
The goal of this tutorial was to show you how you could accept source code from your users, build it, deploy it, and then create a custom URL for them.
We tend to see individual teams having very strong opinions on how to tie together these various steps, which is why I’ve given you the building blocks to do with them as you wish.
If I were to automate all of the steps in this guide, then I'd write an OpenFaaS function using Go which could take advantage of the Go SDK for the OpenFaaS REST API, the Go example for the Function Builder, and the Go Kubernetes clientset for the Ingress Operator. You'd need to trigger the whole workflow from your user's dashboard or the rest of your application, and we have some examples of people who are doing that in the blog post Build a Multi-Tenant Functions Platform with OpenFaaS.
You can monitor the OpenFaaS REST API and the Function Builder API, along with the functions using separate dashboards available in the Customer Community.
If you wish to give your users API access to OpenFaaS via the CLI and UI Dashboard, then you’ll need to configure an Identity Provider (IdP) and SSO/IAM for OpenFaaS.
If you have comments, questions, and suggestions or would just like to talk to us about this guide, you can reach us here.
You may also like:
AWS Lambda was introduced by Amazon in 2014 and popularized the Functions As a Service (FaaS) model. OpenFaaS was introduced in 2016 and is one of the most popular FaaS platforms for Kubernetes.
Here’s what we hear users value in OpenFaaS over a hosted service:
In this article we will take a look at the differences and similarities between OpenFaaS and AWS Lambda and show you what it takes to migrate a Python function across.
We will work through migrating a real-world function that uses the Extract, Transform, and Load (ETL) pattern to take an input video and provide a short video preview. We want to say thank you to Akamai for providing credits for the cluster, we’ll be using their managed Kubernetes service called “LKE”. You could just as easily follow this guide on your laptop with KinD, on AWS EKS, or on-premises with K3s.
Example video generated by the function
The following source video was used as the input to generate an 8 second video summary: https://www.youtube.com/watch?v=l9VuQZ4a5pc
Example video generated by the ETL workflow
In Lambda, when you want to write a function in a certain language, you pick a “runtime”. There are a set of supported runtimes, plus the ability to write your own custom ones. The equivalent for OpenFaaS is a “template” - we provide a number of supported templates, then there are around a dozen more in the function store provided by the community.
In addition to the templates, OpenFaaS can also run CLIs as functions such as networking tools like curl or nmap, and can support running existing containers serving HTTP traffic. Find out more about workloads.
Below, we compare the python3-http OpenFaaS template using Python 3.11 to the Lambda Python 3.10 runtime.
Let’s start by looking at a simple Lambda function that reads request parameters like the headers and the HTTP method.
def lambda_handler(event, context):
    # Get request data from the event
    method = event['requestContext']['http']['method']
    requestBody = event['body']
    contentType = event['headers'].get('content-type')

    return {
        "statusCode": 200,
        "body": {
            "method": method,
            "request-body": requestBody,
            "content-type": contentType
        }
    }
Lambda functions can be triggered from different sources:
The event object will be different depending on the source that triggered the function. In this example we are handling an event triggered by a direct API call to the Lambda function.
Now let's compare this Lambda function to an equivalent OpenFaaS function. In this case the OpenFaaS function uses the python3-http template, which is our recommended template for Python users.
def handle(event, context):
    # Get request data from the event
    method = event.method
    requestBody = str(event.body, 'utf-8')
    contentType = event.headers.get('content-type')

    return {
        "statusCode": 200,
        "body": {
            "method": method,
            "request-body": requestBody,
            "content-type": contentType
        }
    }
In this example, the handler along with the request and response are all very similar, whilst not 100% equivalent. That means porting a function should be relatively straight-forward.
Now, while the structure and type of the event payload in Lambda will vary depending on the trigger, OpenFaaS functions always have the same payload format. The event always contains the same data about the request: body, headers, method, query and path.
In OpenFaaS, every invocation will happen over HTTP, whether it was triggered by an event in Apache Kafka, or a direct HTTP call through the gateway. Just like AWS Lambda, OpenFaaS supports different event sources. The concept we use is a connector. Some of our popular connectors include: Apache Kafka, AWS SQS or cron if a function needs to be invoked on a schedule.
To sum up: Python code looks very similar in Lambda and OpenFaaS. To migrate simple functions you would only need to update your code to handle the different format of the event and context objects. For more complex code, where you rely on a trigger like an SQS event, you will also need to configure and deploy a connector. In Lambda, a function can assume a role and access other services without using credentials. There is similar support available for OpenFaaS functions when running on AWS EKS, however if you’re running on-premises, you may need to obtain credentials and provide them to your function to access databases, object storage, and other managed services.
See also: AWS EKS: Configuring Pods to use a Kubernetes service account
Building and deploying
To deploy a Lambda function, compiled code or scripts and their dependencies need to be built into a deployment package. Lambda supports two types of deployment packages: a zip file, or a container image derived from a supported base image. A deployment package has to be uploaded to S3 or ECR and can then be used to deploy a function.
This usually involves a number of tools like the Lambda console, AWS CLI and some scripts to create the deployment package.
There are CLIs and tools available that automate some of these steps and provide a way to define and manage your Lambda functions e.g. AWS SAM CLI
The main tool to interact with OpenFaaS and build and deploy functions is the faas-cli. The CLI uses Docker to build functions from a set of supported language templates. You can also create your own templates or build functions from a custom Dockerfile. This means that you can use any programming language or toolchain that can be packaged into a container image.
A quick comparison of the developer experience between AWS Lambda and OpenFaaS.
AWS Lambda:
OpenFaaS:

- Functions are built and deployed with the faas-cli. Build and deployment configuration is provided through a stack.yml configuration file.
- Functions can be run locally with faas-cli local-run. It is easy to spin up an OpenFaaS cluster locally or use faasd to test functions.

Want to see a demo of how faas-cli local-run works? See: The faster way to iterate on your OpenFaaS functions
See also: OpenFaaS YAML reference for stack.yaml
In the next section we will show you how to deploy an OpenFaaS cluster and walk through the steps required to migrate a real-world workflow from Lambda to OpenFaaS.
Let's start by setting up an OpenFaaS cluster. We will be using a managed LKE cluster on Akamai Cloud to deploy OpenFaaS.

A new cluster can be created from the Linode dashboard. Follow their getting started guide to set up the Kubernetes cluster.
Did you know? Linode was acquired by Akamai, and is now being branded as “Akamai Cloud Computing”. The rebranding is still in-progress, so we’ll be referring to Linode throughout this article.
For this tutorial I created a 3x node cluster where each worker had 4GB RAM and 2vCPUs.
Once you have a cluster deployed and verified that you are able to access it, move on to the next section to install OpenFaaS.
The version of OpenFaaS you install will depend on your needs, but both will work for this tutorial.
The first option is the Community Edition (CE), which is intended for enthusiasts, experimentation, and for building a Proof of Concept (PoC). You'll get a taste of what the platform is like, with some limits on scalability and intended use.
For commercial use and for production, we recommend OpenFaaS Standard. OpenFaaS for Enterprises is an alternative distribution which is best suited to people wanting to build a white-label functions service, or for regulated companies.
Whichever version you use, installation is quick with the OpenFaaS Helm chart, or arkade which provides a simpler interface to the Helm chart.
arkade install openfaas
You can now run arkade info openfaas to get the instructions to log in with the CLI and obtain the password to access the UI.
To follow along with this tutorial you can use the suggested port-forwarding instructions returned by the info command to access the OpenFaaS gateway.
For more information:
In a previous article, we showed how to build a Highly Available cluster using VMs and K3s. If you’re an Akamai customer or k3s user, you can find out more here: How to set up production-ready K3s with OpenFaaS with Akamai Cloud Computing.
Most of the time your Lambda functions will probably use IAM to access other AWS services and have additional dependencies like Python packages or other native dependencies. That is still possible with OpenFaaS if you deploy to AWS EKS, but it requires some additional configuration.
Since AWS IAM is not available on LKE, we will not include the steps to have an OpenFaaS function assume an IAM role. Instead we’ll create credentials and pass them to the function, this makes the example portable to any cloud and to on-premises environments.
The ETL workflow we will be migrating is a video transformation pipeline using FFmpeg. A function fetches a video and creates a short preview of the input by sampling frames throughout the video and stitching them back together to create the output video.
Video transformation workflow
Let’s start by taking a look at the Lambda function.
import os
import logging
import tempfile
import urllib
import ffmpeg
import boto3

from .preview import generate_video_preview, calculate_sample_seconds

s3_client = boto3.client('s3')

samples = os.getenv("samples", 4)
sample_duration = os.getenv("sample_duration", 2)
scale = os.getenv("scale")
format = os.getenv("format", "mp4")
s3_output_prefix = os.getenv("s3_output_prefix", "output")
debug = os.getenv("debug", "false").lower() == "true"

def handler(event, context):
    s3_bucket_name = event['Records'][0]['s3']['bucket']['name']
    key = urllib.parse.unquote_plus(event['Records'][0]['s3']['object']['key'], encoding='utf-8')

    file_name, _ = os.path.basename(key).split(".")
    output_key = os.path.join(s3_output_prefix, file_name + "." + format)

    out_file = tempfile.NamedTemporaryFile(delete=True)

    try:
        input_url = s3_client.generate_presigned_url('get_object', Params={'Bucket': s3_bucket_name, 'Key': key}, ExpiresIn=60 * 60)
    except Exception as e:
        logging.error("failed to get presigned video url")
        raise e

    try:
        probe = ffmpeg.probe(input_url)
        video_duration = float(probe["format"]["duration"])
    except ffmpeg.Error as e:
        logging.error("failed to get video info")
        logging.error(e.stderr)
        raise e

    # Calculate sample_seconds based on the video duration, sample_duration and number of samples
    sample_seconds = calculate_sample_seconds(video_duration, samples, sample_duration)

    # Generate video preview
    try:
        generate_video_preview(input_url, out_file.name, sample_duration, sample_seconds, scale, format, quiet=not debug)
    except Exception as e:
        logging.error("failed to generate video preview")
        raise e

    # Upload video file to S3 bucket.
    try:
        s3_client.upload_file(out_file.name, s3_bucket_name, output_key)
    except Exception as e:
        logging.error("failed to upload video preview")
        raise e
Source: video_preview/lambda_function.py
An Amazon S3 trigger will invoke the function each time a source video is uploaded to the bucket. The function will look up the bucket name and key of the source video from the event parameters it receives from S3. Next, it will use ffmpeg to generate a video preview from the input video. It does this by taking short samples spread throughout the input video and stitching them back together to create a new video. The ffmpeg output is saved to a temporary file that is then uploaded back to S3.
This is what the video generation code looks like:
def sample_video(stream, sample_duration, sample_seconds=[]):
    samples = []
    for t in sample_seconds:
        sample = stream.video.trim(start=t, duration=sample_duration).setpts('PTS-STARTPTS')
        samples.append(sample)

    return samples

def generate_video_preview(in_filename, out_filename, sample_duration, sample_seconds, scale, format, quiet):
    stream = ffmpeg.input(in_filename)
    samples = sample_video(stream, sample_duration=sample_duration, sample_seconds=sample_seconds)
    stream = ffmpeg.concat(*samples)

    if scale is not None:
        width, height = scale.split(':')
        stream = ffmpeg.filter(stream, 'scale', width=width, height=height, force_original_aspect_ratio='decrease')

    (
        ffmpeg
        .output(stream, out_filename, format=format)
        .overwrite_output()
        .run(quiet=quiet)
    )
Source: video_preview/preview.py
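The calculate_sample_seconds helper imported above isn't reproduced in this post. A minimal sketch of what such a function might look like, assuming the sample start points are simply spread evenly across the video, is:

```python
def calculate_sample_seconds(video_duration, samples, sample_duration):
    # Hypothetical implementation: space the sample start times evenly,
    # leaving room for the final sample to finish before the video ends.
    samples = int(samples)
    sample_duration = float(sample_duration)
    interval = (video_duration - sample_duration) / max(samples, 1)
    return [round(i * interval, 2) for i in range(samples)]
```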
The FFmpeg bindings package, ffmpeg-python, is used to interact with FFmpeg. This means our Lambda function requires the ffmpeg-python package and ffmpeg as a runtime dependency. In the previous section we talked about the different methods to include runtime dependencies: either by including them in the .zip file archive for your Lambda function or by creating a container image.

Check out the AWS docs for more info on how to add runtime dependencies to Python Lambda functions.
The function can be configured through env variables. Some of the parameters include:
- samples - The number of samples to take from the source video.
- sample_duration - The duration of each sample.
- scale - Resize the output video to this scale, width:height.
- format - The output video format, e.g. mp4, webm, flv.
A couple of things to note for this function:

- The S3 client is initialized with the default configuration: s3_client = boto3.client('s3'). AWS maps the execution environment to the account and IAM role of the Lambda function. This will allow it to access your AWS S3 bucket.

For a detailed overview on how to create a Lambda function that is triggered by S3 bucket events and how to configure the required IAM roles and permissions, take a look at this tutorial: Using an Amazon S3 trigger to invoke a Lambda function
In this section we will take the function code from our Lambda function and walk through the steps required to run it as an OpenFaaS function. Since our function will be deployed to a Linode LKE cluster, we will also be migrating from AWS S3 to Linode Object Storage at the same time.
To migrate the function we will need to:
Setup Linode Object Storage
You can follow the official getting started guide to enable Object Storage on Linode and create a new bucket. For this demo we created a bucket named video-preview.
Make sure to save the access key and access secret for the bucket at the following paths:

.secrets/video-preview-s3-key
.secrets/video-preview-s3-secret

faas-cli local-run uses the .secrets folder to look for secrets files when running the function locally for development. We will also be using these files to create the required secrets in our OpenFaaS cluster before we deploy the function.
Create the OpenFaaS function
Scaffold a new Python function using the faas-cli:
# Pull the python3-http template from the store
faas-cli template store pull python3-http
# Scaffold the function.
faas-cli new video-preview --lang python3-http
# Rename the function configuration file.
mv video-preview.yml stack.yml
We are using the python3-http template to scaffold the function. This template creates a minimal function image based on Alpine Linux. If your function depends on modules or packages that require a native build toolchain, such as Pandas, Kafka, SQL etc., we recommend using the python3-http-debian template instead.

Once the video-preview function is created from the template you can copy over the code from the Lambda function. We will start refactoring it step by step. Make sure to also copy the preview.py file from the Lambda function.
Initialize the S3 client
The S3 client in the boto3 SDK can be used with any S3-compatible object storage, so we won't have to swap out the client to make it work with Linode Object Storage. However, we will need to configure the client with access credentials and the correct endpoint URL.

For more info check out the guide: Using the AWS SDK for Python (boto3) with Linode Object Storage

Even if you opted to keep using AWS S3 for object storage, your OpenFaaS functions will not be able to automatically map an IAM role to access S3. You will need to create an AWS user with the same role the Lambda function was using so that you can get the appropriate access keys for the S3 client.
Instead of initializing the boto3 client with the defaults we will create a separate function, init_s3, to configure the client with the required parameters. This function can be used in the function handler to initialize the S3 client the first time the function runs. After initialization the client is assigned to a global variable so that it can be reused on subsequent calls.

Update handler.py of the video-preview function:
import os
import logging
import tempfile
import urllib
import ffmpeg
import boto3
+ from botocore.config import Config

from .preview import generate_video_preview, calculate_sample_seconds

- s3_client = boto3.client('s3')
+ s3_client = None

s3_output_prefix = os.getenv("s3_output_prefix", "output")
debug = os.getenv("debug", "false").lower() == "true"
samples = os.getenv("samples", 4)
sample_duration = os.getenv("sample_duration", 2)
scale = os.getenv("scale")
format = os.getenv("format", "mp4")

def handle(event, context):
+     global s3_client

+     # Initialise an S3 client upon first invocation
+     if s3_client == None:
+         s3_client = init_s3()
Let's take a look at the init_s3 function:
def init_s3():
    with open('/var/openfaas/secrets/video-preview-s3-key', 'r') as s:
        s3Key = s.read()
    with open('/var/openfaas/secrets/video-preview-s3-secret', 'r') as s:
        s3Secret = s.read()

    s3_endpoint_url = os.getenv("s3_endpoint_url")

    session = boto3.Session(
        aws_access_key_id=s3Key,
        aws_secret_access_key=s3Secret,
    )

    return session.client('s3', config=Config(signature_version='s3v4'), endpoint_url=s3_endpoint_url)
The S3 credentials are provided to the function as secrets. Confidential configuration like API tokens, connection strings and passwords should never be made available in the function through environment variables. Secrets can be read from the following location in the function container: /var/openfaas/secrets/<secret-name>.

The init_s3 function reads the S3 key and secret from the file system. The S3 endpoint URL is read from an environment variable. Next, these parameters are used to initialize the client.
The function configuration in the stack.yml file needs to be updated. To tell OpenFaaS which secrets to mount for a function, add the secret names to the secrets section. Also include the s3_endpoint_url for your Linode region in the environment section.
functions:
  video-preview:
    lang: python3-http
    handler: ./video-preview
    image: welteki/video-preview:0.0.2
+   environment:
+     s3_endpoint_url: https://fr-par-1.linodeobjects.com
+   secrets:
+     - video-preview-s3-key
+     - video-preview-s3-secret
Make sure the secrets are created in the OpenFaaS cluster before deploying the functions. Secrets can be created in several ways, either through the REST API or using the faas-cli. In this example we will use the faas-cli to create the secrets.
faas-cli secret create video-preview-s3-key \
--from-file .secrets/video-preview-s3-key
faas-cli secret create video-preview-s3-secret \
--from-file .secrets/video-preview-s3-secret
You can check out the documentation for more info on how to use secrets within your functions.
Add code dependencies
With AWS Lambda, extra binaries, packages and modules that the function code depends on need to be included in the deployment package. For Lambda this deployment package can either be a .zip file archive or a container image.
OpenFaaS functions are always built into a container image. Our official templates support including dependencies in the function image without having to create your own template and Dockerfile.
For Python functions, modules and packages can be added by including them in requirements.txt. Additional packages can be installed in the function image through build arguments.
The function handler folder includes a requirements.txt file that was created while scaffolding the video-preview function from the python3-http template. All Python packages the function code depends on need to be added here.

The video-preview function uses the official AWS SDK for Python, boto3, to upload files to any S3-compatible object storage. The ffmpeg-python package provides bindings to FFmpeg and is used to process the input video. Make sure both are included in the requirements.txt file:
boto3
ffmpeg-python
You have to make sure all additional binaries the code depends on are installed in the function image. In this case our code depends on FFmpeg. Additional packages can be installed in the function image through build arguments.

With the official python-http template the build argument ADDITIONAL_PACKAGE can be used to specify additional apk or apt packages that need to be installed.
Update the function's stack.yml configuration to include FFmpeg as an additional package:
functions:
  video-preview:
    lang: python3-http
+   build_args:
+     ADDITIONAL_PACKAGE: "ffmpeg"
See the docs for more details on adding native dependencies to OpenFaaS Python functions.
Refactor the function handler
The handle function will need to be updated to handle the different format and type of the event parameter.

Our Lambda function used an S3 trigger that invoked the function each time a new video was uploaded to the AWS S3 bucket. At the moment of writing Linode Object Storage does not have support for bucket notifications, so we will update our function handler to accept a JSON payload with a download link instead.
If you want to copy the AWS workflow and trigger the function on bucket notifications you could add Ceph storage to your cluster with Rook. It has support for setting up S3 compatible Object storage and sending bucket notifications over HTTP. Minio is another option that also supports sending bucket notifications over HTTP.
Configuring any of these is outside the scope of this post.
import os
+ import json
import logging
import tempfile
import ffmpeg
import boto3
from botocore.config import Config

from .preview import generate_video_preview, calculate_sample_seconds

s3_client = None

samples = os.getenv("samples", 4)
sample_duration = os.getenv("sample_duration", 2)
scale = os.getenv("scale")
format = os.getenv("format", "mp4")
s3_output_prefix = os.getenv("s3_output_prefix", "output")
+ s3_bucket_name = os.getenv('s3_bucket')
debug = os.getenv("debug", "false").lower() == "true"

def handle(event, context):
    global s3_client, s3_endpoint

    # Initialise an S3 client upon first invocation
    if s3_client == None:
        s3_client = init_s3()

+     data = json.loads(event.body)
+     input_url = data["url"]

-     file_name, _ = os.path.basename(key).split(".")
+     file_name, _ = os.path.basename(input_url).split(".")
    output_key = os.path.join(s3_output_prefix, file_name + "." + format)

    out_file = tempfile.NamedTemporaryFile(delete=True)

-     try:
-         input_url = s3_client.generate_presigned_url('get_object', Params={'Bucket': s3_bucket_name, 'Key': key}, ExpiresIn=60 * 60)
-     except Exception as e:
-         logging.error("failed to get presigned video url")
-         raise e

    try:
        probe = ffmpeg.probe(input_url)
        video_duration = float(probe["format"]["duration"])
    except ffmpeg.Error as e:
        logging.error("failed to get video info")
        logging.error(e.stderr)
        raise e

    # Calculate sample_seconds based on the video duration, sample_duration and number of samples
    sample_seconds = calculate_sample_seconds(video_duration, samples, sample_duration)

    # Generate video preview
    try:
        generate_video_preview(input_url, out_file.name, sample_duration, sample_seconds, scale, format, quiet=not debug)
    except Exception as e:
        logging.error("failed to generate video preview")
        raise e

    # Upload video file to S3 bucket.
    try:
        s3_client.upload_file(out_file.name, s3_bucket_name, output_key, ExtraArgs={'ACL': 'public-read'})
    except Exception as e:
        logging.error("failed to upload video preview")
        raise e
Source: video-preview/handler.py
Changes made to the handler function:
- The bucket name, assigned to s3_bucket_name, is now read from an environment variable. Make sure to add s3_bucket to the environment section in the stack.yml file.

These are the minimal changes required to run our code as an OpenFaaS function.
Deploy the OpenFaaS function
Before you go ahead and deploy the function to the OpenFaaS cluster make sure to check the stack.yml file. After adding all the configuration options from the previous steps it should look something like this:
version: 1.0
provider:
  name: openfaas
  gateway: http://127.0.0.1:8080
functions:
  video-preview:
    lang: python3-http
    build_args:
      ADDITIONAL_PACKAGE: "ffmpeg"
    handler: ./video-preview
    image: welteki/video-preview:0.0.2
    environment:
      s3_bucket: video-preview
      s3_endpoint_url: https://fr-par-1.linodeobjects.com
      write_timeout: 10m2s
      read_timeout: 10m2s
      exec_timeout: 10m
    secrets:
      - video-preview-s3-key
      - video-preview-s3-secret
Source: stack.yml
Note that we included three additional environment variables to configure the function’s timeouts. Transforming and transcoding videos can take some time depending on the size of the source video. If you have long running functions make sure the timeouts are configured properly so your functions can finish their work.
For quick iterations and testing during development, OpenFaaS functions can be run locally with Docker using the faas-cli local-run command. We show how to use this feature in our blog post: The faster way to iterate on your OpenFaaS functions.
To deploy the function run:
# URL to the OpenFaaS gateway
export OPENFAAS_URL="https://openfaas.example.com"
faas-cli up
This will build the function, push the resulting image and deploy the function to your OpenFaaS cluster.
Invoke the function with curl to test it:
curl -i https://openfaas.example.com/function/video-preview \
-H 'Content-Type: application/json' \
-d '{"url": "https://video-preview.fr-par-1.linodeobjects.com/input/openfaas-homepage-vid.webm"}'
If the invocation was successful you should be able to find the processed video in the output folder in your S3 bucket. You can log in to the Linode Cloud Manager to check the content of the bucket.
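If you'd rather check the bucket from code than from the Cloud Manager, here's a minimal boto3 sketch that lists objects under the output prefix. The endpoint URL, bucket name and secret file paths reuse the values from this tutorial:

```python
import boto3
from botocore.config import Config

# Reuses the credentials saved earlier in the .secrets folder.
with open(".secrets/video-preview-s3-key") as f:
    key = f.read().strip()
with open(".secrets/video-preview-s3-secret") as f:
    secret = f.read().strip()

s3 = boto3.client(
    "s3",
    aws_access_key_id=key,
    aws_secret_access_key=secret,
    endpoint_url="https://fr-par-1.linodeobjects.com",
    config=Config(signature_version="s3v4"),
)

# List the generated previews under the "output/" prefix of the bucket.
for obj in s3.list_objects_v2(Bucket="video-preview", Prefix="output/").get("Contents", []):
    print(obj["Key"], obj["Size"])
```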
We refactored our video-preview function to accept a url in a JSON payload. You can improve the function by accepting a trigger from S3.
We migrated a long running video transformation function that can be resource intensive. To prevent overloading the function you cloud set limits and configure autoscaling. Checkout these blog post to learn how this can be done:
We saw how to deploy OpenFaaS in a managed Kubernetes cluster with LKE. Alternatively you can create a cluster yourself. Check out our tutorial: How to set up production-ready K3s with OpenFaaS with Akamai Cloud Computing
We migrated our ETL pipeline from being deployable only on AWS Lambda infrastructure to being completely portable by using OpenFaaS.
Additionally developers are able to test their functions locally by either using the faas-cli local-run
command or deploying an OpenFaaS cluster locally.
If you are using Node.js or JavaScript with AWS Lambda, then there’s a similar guide you can follow which also touches on how to use IAM Service Accounts in AWS EKS: Migrate Your AWS Lambda Functions to Kubernetes with OpenFaaS
You may also like:
]]>In this post I’ll give an overview of what we learned spending a week investigating a customer issue with scaling beyond 3500 functions. Whilst navigating this issue, we also implemented several optimisations and built new tooling for testing Kubernetes at scale.
If you’ve ever written software, which other people install and support, then you’ll know how difficult and time-consuming it can be to debug and diagnose problems remotely. In this case it was no different, with our team spending over a week of R&D trying to reproduce the problem, pin-point the cause, and remediate it.
We had a support request from a customer that was running more functions than the typical team, so we decided to take a look at the problem and see what we could do to help them out.
In this post, you’ll see my thought process, and input from the OpenFaaS and Kubernetes community. Thanks to everyone who made suggestions or talked the problem over with me.
How many functions is a normal amount?
You may be tempted to think of “serverless” through the lens of a cloud hosting company like AWS, Google Cloud or Azure, and their large scale multi-tenant products. With that perspective, it’s tempting to think that every other functions platform should work in exactly the same way, and at the same scale, going up to millions of functions per installation.
We should perhaps pause and understand the target user for OpenFaaS, which is largely self-selecting and made up of individual product teams.
We are aware of a handful of users running thousands of functions in production, usually as part of a system to provide customers with a sandbox for custom code. More recently, we’ve started to work with ISPs around the world who want to bring a functions experience to their platform, but even there, uptake takes time, and may never quite hit the numbers of a large cloud company.
Most users adopt OpenFaaS for portability, to be able to install on different clouds or directly onto customer hardware, for cost optimisation vs hosted functions, or in the case of Surge, because of all the effort that’s been spent on the developer experience. You can build, test and debug functions on your laptop with the same experience as you’d get in production.
“OpenFaaS has been insanely productive for our business. The ability to run the whole stack locally is really important for our developers, and the async system based upon NATS JetStream means we can fire off our jobs and forget about them. We’re now turning to Kubernetes deployments and converting them into functions.”
So it’s not that we discourage large scale, or function hosting, we are seeing growing customer interest there, it’s just that OpenFaaS is popular with individual project teams who have a few dozen functions.
What does it cost to test at scale?
In addition, with OpenFaaS running on top of Kubernetes, we have to provision a whole node for every 100-110 containers, including the control-plane, service mesh and networking components. So for 3000 functions, you need at least 30 nodes.
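A quick back-of-the-envelope check of that claim, assuming a practical limit of around 100 Pods per node once system Pods are taken into account:

import math

functions = 3000
pods_per_node = 100  # Kubernetes defaults to a limit of 110 Pods per node; leave headroom for system Pods

print(math.ceil(functions / pods_per_node))  # 30 nodes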
We all know that clusters are slow to create on a platform like AWS Elastic Kubernetes Service (EKS), and then adding nodes can take a good 3-5 minutes each. I did a calculation with the AWS cost estimator and to get to 3500 functions, you’d probably be looking at a spend of 1500 USD / mo in infrastructure costs alone.
How did we find the problem?
The problem was finally found after spending over a week building optimizations, and it was frustratingly obvious.
I started off by looking at hardware that I already owned. My workstation runs Linux and has an AMD Ryzen 9 5950x with 16C/32T and 128GB of RAM. Then, behind me sits the Ampere Developer Platform with 64C and 64GB of RAM. I paid an additional 500 USD to upgrade the Ampere machine to 128GB of RAM in order to recreate the customer issue.
The container limit of 110 per Kubernetes node means that even if you have a bare-metal machine like this, it’s largely wasted, unless you are running a few very large Pods.
Could existing testing solutions help?
A friend mentioned the community project Kubernetes WithOut Kubelet (KWOK) as a potential option.
I did some initial testing here and showed my work, but for numerous reasons it did not work for this use-case.
You can read the thread here, if interested: Testing an Operator with KWOK.
Could we slice up the bare-metal?
So what to do? My first instinct was to use multipass, a tool that we’ve been using for faasd development and for testing K3s, to create 30 VMs on each machine, combining them to get a 60 node cluster, which would allow for going up to at least 6000 functions, 2x over where the customer’s cluster was stalling.
the Ampere Dev Platform by ADLINK was used for initial testing
Multipass created 10 nodes in about 15 minutes, then when I ran the command to list VMs, it took about 60s to return the results. I knew that this was not a direction I wanted to go in.
Having written actuated over a year ago, to launch Firecracker VMs for GitLab CI and GitHub Actions jobs, I knew that I could get a workable platform together in 2 days to convert the bare-metal machines into dozens of VMs. So that’s what I did.
From the outside, it looked like the code had locked up or got stuck. After reproducing the issue, I decided to add a Prometheus histogram metric to see how often the reconciliation code was being called, and to see how long it took for each call.
The duration of each call was effectively 0ms, so it wasn’t hanging. Then I noticed that the count of invocations was increasing at 10 Queries Per Second (QPS).
It turned out that the samples provided by the Kubernetes community use an internal rate-limiter with a value of 10 QPS. It sounds so obvious when you find it, but it took a week to get there.
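The real limiter lives in client-go and is written in Go, but the effect is easy to illustrate with a naive sketch: at 10 QPS it doesn’t matter how many functions are queued, only 10 reconciliations can start per second, so thousands of objects take minutes to drain and the controller looks like it has locked up.

import time

class SimpleRateLimiter:
    # Naive fixed-rate limiter for illustration only, not client-go's implementation
    def __init__(self, qps):
        self.interval = 1.0 / qps
        self.next_free = time.monotonic()

    def wait(self):
        now = time.monotonic()
        if now < self.next_free:
            time.sleep(self.next_free - now)
        self.next_free = max(now, self.next_free) + self.interval

limiter = SimpleRateLimiter(qps=10)

# Draining a work-queue of 3500 items at 10 QPS takes ~350s at best:
# for item in queue: limiter.wait(); reconcile(item)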
I left the tests running overnight and saw that for 6500 functions, which had not changed, the reconciliation function had been called 1.1M times. This again was due to a faulty piece of code inherited from the original community sample called “sample-controller”. I spoke to Dims, who works on Kubernetes full-time, and he sent a PR to resolve the issue so it won’t affect others who are following the sample to build their own controllers.
After my initial testing of 4000 functions on a cluster built with my slicer tool across my workstation and the Ampere Dev Platform, I wanted to go large, and show that we’d fixed the issue. I set up 3x servers on Equinix Metal with an AMD EPYC with 32C/64T and 256GB of RAM.
My configuration was for 5x servers using 16GB of RAM and 8vCPU each, and then the rest of that machine, and the other three were split up into ~ 30 nodes of 2x vCPU and 8GB of RAM.
Put some K3sup (ketchup) on it
As the author and maintainer of K3sup, I knew that K3s would be a really quick way to build out a High Availability (HA) Kubernetes cluster with multiple servers and agents. K3sup is a CLI with install and join commands which are expected to be used against a single host at a time. There needed to be a way to make this more practical for 60-120 VMs.
That’s where k3sup plan
came into being. My slicer tool can emit a JSON file with the hostname and IP address of its VMs. I took that file from all four servers, combined it into a single file, then ran the new command. The command generates a bash script against each of the VMs, allocating a set number of them to act as servers via a --servers
flag. The output can then be run using the existing k3sup commands.
A new k3sup plan command for creating huge clusters
This command may make it into the k3sup repository, but I’m still iterating on it. Let me know if it’s something you’d be interested in.
Load balancing the server VMs
The 5x server VMs will load balance the API server, but are only accessible within a private network on the Equinix Metal servers, so I used a TCP load-balancer (mixctl) to expose the private IPs via the server’s public IP:
rules.yaml:
version: 0.1
- name: k3s-api
from: 147.28.187.251:6443
to:
- 192.168.1.19:6443
- 192.168.1.20:6443
- 192.168.1.21:6443
The public IP of the server was then used in the k3sup plan
command via the --tls-san
flag.
There was one other change that I made to k3sup: whenever you join a new node into the cluster, the command first makes an SSH connection to the server to download the “node join token”, then keeps it in memory and uses it to run an SSH command on the new node.
That overwhelmed the server when I ran all 120 k3sup join
commands at once, so now k3sup node-token
will get the token, either into a file or into memory, and can then be passed in via k3sup join --node-token
.
Leader election
I was in two minds about implementing lease-based Leader Election, because it’s a divisive topic. Some people haven’t had any issues, but others have had significant downtime and have experienced extra load on the API server due to using it.
Lease-based leader election
When three replicas of the Gateway Pod start up, each starts a REST API which can serve invocations, and the REST API for configuring Functions. However, only one of the three replicas should be performing reconciliation of Functions into Deployments and Services, so whichever takes the lease will act as a leader, and the others will just stand by. If the leader gives up the lease due to a graceful shutdown, another will take over. If the leader crashes or a spot instance is terminated, then the lease will expire after 60s and another replica will take over.
Leader Election is optional and disabled by default, but if you are running more than one replica of the gateway, it’s recommended, and prevents noise from conflicting writes or updates, which must in turn be evaluated by the Kubernetes API server and OpenFaaS.
operator:
  # For when you are running more than one replica of the gateway
  leaderElection:
    enabled: true
See also: client-go leader-election sample
QPS and Burst values available in the chart
We’ve made the QPS and Burst values for accessing the Kubernetes API, and for the internal work-queue configurable by Helm, so people with very large clusters or very small ones can tune these values accordingly. We’ve also upped the defaults to sensible numbers.
operator:
  # For accessing the Kubernetes API
  kubeClientQPS: 100
  kubeClientBurst: 250

  # For tuning the work-queue for Function events
  reconcileQPS: 100
  reconcileBurst: 250
Endpoints are replaced with EndpointSlices
EndpointSlices were introduced into Kubernetes to reduce the load generated by service meshes and IngressControllers. Instead of querying a single item, a set of items can be returned for endpoints for a given service.
We’ve switched over. You’ll see a benefit if you run lots of replicas of a function, but it won’t have much effect when there is a large number of functions with only one replica.
Here’s how you can compare how the two structures look:
# Deploy a function
$ faas-cli store deploy nodeinfo
# Scale it to 5/5 replicas
$ kubectl scale deploy/nodeinfo -n openfaas-fn --replicas=5
# View the endpoints
$ kubectl get endpoints/nodeinfo -n openfaas-fn -o yaml
# View the slices, you should see one:
$ kubectl get endpointslice -n openfaas-fn | grep nodeinfo
nodeinfo-9ngtv IPv4 8080 10.244.0.165 8s
# Then view the structure of the slice to understand the differences
$ kubectl get endpointslice/nodeinfo-9ngtv -n openfaas-fn -o yaml
In this case, there were 5x endpoints that would have to be fetched from the API, but only one EndpointSlice, making it more efficient to keep in sync as functions scale up and down.
API reads are now cached
There were 2-3 other places where direct API calls were being made during the reconciliation loop or in the HTTP API. For read operations, using a cached informer is more efficient. So we’ve done that and it means already reconciled functions pass through the sync handler in 0ms.
You’ll see a new log message upon start-up such as:
Waiting for caches to sync for faas-netes:EndpointSlices
Waiting for caches to sync for faas-netes:Service
Once the initial cache is filled, a Kubernetes watch picks up any changes and keeps the cache up to date.
Many log messages have been removed
We took the verbosity down a notch or two.
With previous versions, if a Function CR had been created, but not yet reconciled, then the logs for the operator would have printed a message saying the Deployment was not available, whenever a component tried to list functions. That was noise that we just didn’t need so it was taken away.
The same is the case for when a function is deployed via REST API, we used to print out a message saying “Deploying function X”. Well, that’s very noisy when you are trying to create 15000 functions in a short period of time.
Lastly, whenever a function was invoked, we printed out the duration of the execution. We removed this too, because printing a log statement for each invocation only adds noise for log aggregators like Loki or Elasticsearch. Imagine how many useless log lines you would have seen from a load test over 5 minutes with 100 concurrent callers?
After having got to 6500 functions without any issues on my own hardware at home, I decided to go large for the weekly Community Call where we deployed 15k functions across 3 different namespaces, with 5000 in each.
The video recording includes a short 4 minute introduction to explain what viewers are going to see, who may not have already read this blog post.
See also: Multiple namespaces
Not only have we fixed the customer issue where the operator seemed to “lock-up” at 3500 functions, but with the knowledge gained by writing actuated, we were able to test 15000 functions in a cost efficient manner using bare-metal hosts on Equinix Metal.
The updated operator has already been released for OpenFaaS Standard and OpenFaaS for Enterprise customers. You don’t have to be running at massive scale to update and get these enhancements.
Kevin, mentioned earlier, runs OpenFaaS with far fewer functions, but with a heavier load.
When he saw the video on the community call he remarked:
“The amount of work that has gone into OpenFaaS over the years to support customers is incredible. Good job, really well done.”
Just upgrade your Helm chart to get the latest changes, and if you’d like to use leader election, see the notes earlier in this post or in the values.yaml file under the operator
section.
You may also like:
Do you also need to test at scale - efficiently?
How are you testing your Kubernetes software at massive scale? Do you just run up a 2-3k USD / mo bill and hope that your boss won’t mind? Maybe you are the boss, wouldn’t it be nice to have a long term large test environment always on hand?
If you think you’d benefit from the “slicer” tool I built as part of this support case, please feel free to reach out to me directly.
Here’s a separate video explaining how the slicer works with k3sup plan.
Example slicer config for 3x servers and 10x workers on a machine with 128GB of RAM and 64 threads.
config:
  # Total RAM = 128GB
  # Total threads/vCPU = 32
  host_groups:
  - name: servers
    count: 3
    vcpu: 4
    ram_gb: 8
    # RAM = 24, vCPU = 12
  - name: workers
    count: 10
    vcpu: 2
    ram_gb: 8
    # RAM = 80, vCPU = 20
Disclosure: Ampere Computing provided me with the Ampere Developer Platform at no cost, for evaluation and for open source enablement. Ampere Computing is a customer of our secure CI/CD platform actuated.dev.
]]>In this walk-through, we’ll set up a development account on Confluent Cloud for free access to an Apache Kafka cluster suitable for testing and development. We’ll then set up the Kafka Connector which is bundled with OpenFaaS Standard to trigger functions on new messages.
Most of the time we see people publishing JSON, however binary and text data are also supported. So your function will receive a payload in the HTTP body, along with other metadata like the topic name, additional headers, partition and offset.
How it works
The conceptual overview shows subscriptions being managed by the Kafka Connector, rather than replicas of functions.
Instead of managing dozens or hundreds of individual subscriptions between the various replicas of each function, this is managed in the long-lived Kafka connector. This pattern is common across the various connectors including Postgres, AWS SNS/SQS, Cron, etc. In addition to helping manage the number of subscribers per partition, having the subscription managed in a connector means that all functions can be scaled to zero safely.
An OpenFaaS event connector can subscribe to one or many topics, and then invoke functions based upon the messages it receives. The connector is stateless, and can be scaled up or down to match the number of partitions in the topic. If it happens to crash, it’ll pick up again from the last offset that was committed to the partition.
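The connector itself is not written in Python, but the consume-then-commit pattern it relies on is easy to illustrate with the confluent-kafka client. The broker address, group id and invoke_function call below are placeholders for illustration:

from confluent_kafka import Consumer

conf = {
    'bootstrap.servers': 'broker:9092',   # placeholder address
    'group.id': 'openfaas-demo',
    'auto.offset.reset': 'earliest',
    'enable.auto.commit': False,
}

consumer = Consumer(conf)
consumer.subscribe(['faas-request'])

while True:
    msg = consumer.poll(timeout=1.0)
    if msg is None or msg.error():
        continue

    invoke_function(msg.value())   # i.e. an HTTP POST to the function (placeholder)

    # Commit the offset only after the message has been handled, so a crash
    # means re-processing from the last committed offset rather than losing messages
    consumer.commit(message=msg)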
Once a connector is deployed and subscribed to one or more topics, then all you need to do is to update a function with an extra annotation. So if you added the topic: payment.created
annotation to the new-payment
function, from there on it would be invoked with the payload of every message that was published to the payment.created
topic.
There are several common ways to configure Apache Kafka, then a few more esoteric options used by some enterprise companies.
In order of complexity:

1. No encryption and no authentication, typically only used within a private network
2. TLS encryption, without authentication
3. TLS encryption with a client certificate for authentication
4. TLS encryption with SASL (a username and password) for authentication
In general, hosted providers will always enable TLS, then use either 3) a client certificate or 4) SASL for authentication.
Confluent Cloud uses TLS and SASL. SASL is a username and password. This is the option we’ll be using here, and means creating two secrets in Kubernetes, one for the username, and one for the password.
Aiven uses TLS plus a client certificate using their own self-signed CA, which means creating three secrets, one for the CA, one for the client certificate and one for the private key.
The argument that we tend to hear for 1 or 2 is that a team may be running their stack within a private network or VPC. This model only provides the illusion of security, and is not recommended. It can mean that an eavesdropper or malware running within the private environment could potentially gain access to the Kafka cluster.
Head over to Confluent Cloud and sign up as a new customer.
Click on Environments and Default, if Default is not displayed, create it.
Click Clusters or “Add Cluster”
At time of writing, the Basic tier of cluster is free, and more than suitable for testing out the Kafka Connector to see how it works.
If you are concerned about being charged by Confluent for your testing, then pay close attention to any limits or quotas that you may exceed.
Pick a cloud from AWS, Google Cloud or Azure. If you already use one of these vendors, use that one so that you can keep your data in the same region.
Click Cluster Overview, then API Keys and Create Key
For testing, Confluent recommend using the Global access key.
Save the Key as kafka-broker-username.txt
Save the Secret as kafka-broker-password.txt
Add a description such as “openfaas” or “kafka-connector”
Next, click Cluster Settings, and under Endpoints copy the Bootstrap server value.
This will look like pkc-l6wr6.europe-west2.gcp.confluent.cloud:9092
.
Note this down for later use.
Create a topic for testing and name it faas-request
:
For testing, you could set the partition count to a smaller value like 3.
Next we’ll configure the Helm chart, and use the topic name in the configuration.
Add the OpenFaaS Helm chart repository to Helm, then update your repositories:
helm repo add openfaas https://openfaas.github.io/faas-netes/
helm repo update
The Helm Chart for the Kafka Connector can be installed in two ways:
All settings are configured through a values.yaml file, and the end of the README file has a reference explaining all the various options.
Now create a values.yaml file, and add each of the below sections:
For the topic or topics, provide either a single topic, or a comma-separated list of topics.
topics: faas-request
You should also set the content type that you expect messages to be encoded in:
contentType: text/plain
The most common option is application/json
, but you can also use text/plain
or application/octet-stream
for binary data.
Your function’s handler will receive the message as the body of the HTTP request, and a number of additional headers defined in the docs.
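As a minimal sketch, a Python handler for JSON messages might look like the following. The X-Topic header name is shown as an example; check the Kafka Connector docs for the exact header names and values:

import json

def handle(event, context):
    # Header names are documented in the Kafka Connector docs
    topic = event.headers.get("X-Topic", "unknown")

    # With contentType: application/json the body is the raw message as JSON text
    payload = json.loads(event.body)

    print("Received message on topic {}: {}".format(topic, payload))

    return {
        "statusCode": 200,
        "body": "Accepted"
    }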
Next add the address of the Kafka Cluster or Broker:
brokerHosts: pkc-l6wr6.europe-west2.gcp.confluent.cloud:9092
If you have more than one bootstrap server in the cluster, you can specify each in a comma-separated list.
Make sure you change this value; do not use the value above, which is from our own test cluster.
Since we know that both TLS and SASL are enabled on Confluent Cloud, we should now add:
tls: true
saslAuth: true
Finally, add a secret for the username and password that we saved earlier:
kubectl create secret generic \
  kafka-broker-username \
  -n openfaas \
  --from-file broker-username=$HOME/kafka-broker-username.txt

kubectl create secret generic \
  kafka-broker-password \
  -n openfaas \
  --from-file broker-password=$HOME/kafka-broker-password.txt
Depending on how many times you want to install the Kafka connector, you may wish to change the name of the installation in Helm (kafka-connector
), or if you are using a single topic, you can leave it as-is.
helm upgrade kafka-connector openfaas/kafka-connector \
--install \
--namespace openfaas \
-f values.yaml
Check the logs of the connector:
kubectl logs -n openfaas deploy/kafka-connector
If it is not loading, then you may have missed a secret, check this by running kubectl describe -n openfaas deploy/kafka-connector
followed by kubectl get events -n openfaas --sort-by=.metadata.creationTimestamp
.
There are several other ways to troubleshoot the connector, by turning on verbose and debug logging.
Update values.yaml and then run the helm upgrade
command again:
logs:
  # Log debug messages
  debug: true

  # Print the data read from the Kafka topic before invoking functions
  printRequestBody: true

  # Print the data received from invoked functions
  printResponseBody: true
It’s recommended to turn off those settings for production, when you’ve resolved any issue that you’re facing.
The Kafka topic we created was called faas-request, we can have a function subscribe to this topic by adding a topic
annotation.
The printer function from the OpenFaaS Store will show the message that it received in its logs along with any additional headers.
Create a stack.yml file:
provider:
  name: openfaas

functions:
  printer:
    skip_build: true
    image: ghcr.io/openfaas/printer:latest
    annotations:
      topic: faas-request
Now run faas-cli deploy
to deploy the function.
Alternatively, you can deploy without a YAML file: faas-cli store deploy printer --annotation topic=faas-request
Now over on the Confluent Dashboard, navigate to the Topics, then faas-request, Messages then Produce new message to this topic.
Produce a message on the topic
Navigate to the OpenFaaS Standard Dashboard, and click on the printer function, then Logs.
Viewing the logs of the invocation
In addition to the body, you’ll also note a number of headers; these are explained in more detail in the Kafka Connector docs.
If you have existing systems that publish messages to Apache Kafka, you’ll be able to configure the connector to start sending those messages to functions.
However, if you do not currently have any message producers, then you can publish messages from a function by using a Kafka client library, such as confluent-kafka for Python or Sarama for Golang, etc.
Bear in mind that most Python libraries for Kafka will use librdkafka, which is a native C/C++ library, and will likely build from source when building your function. For that reason, you should use an OpenFaaS template based upon Debian Linux, which includes a C++ toolchain.
Code samples for producing messages in different languages
By clicking “Add Client” in the Confluent Dashboard, you can see the code samples for producing messages in different languages, and also discover different SDKs.
To produce messages on the faas-request
topic, run the following:
# Replace the below value with your own registry and username
export OPENFAAS_PREFIX=ttl.sh/openfaas-tutorial
faas-cli template store pull python3-http-debian
faas-cli new --lang python3-http-debian producer
echo "confluent-kafka" > producer/requirements.txt
Create two secrets for the function:
faas-cli secret create kafka-broker-username \
--from-file ~/kafka-broker-username.txt
faas-cli secret create kafka-broker-password \
--from-file ~/kafka-broker-password.txt
Update producer.yml
:
functions:
  producer:
    lang: python3-http-debian
    handler: ./producer
    image: ttl.sh/openfaas-tutorial/producer:latest

    ### Add/customise the below
    environment:
      kafka_broker: "pkc-l6wr6.europe-west2.gcp.confluent.cloud:9092"
    secrets:
    - kafka-broker-username
    - kafka-broker-password
Then write a handler:
from confluent_kafka import Producer
import socket, os

def handle(event, context):
    username = get_secret('kafka-broker-username')
    password = get_secret('kafka-broker-password')

    broker = os.getenv("kafka_broker")

    conf = {
        'bootstrap.servers': broker,
        'security.protocol': 'SASL_SSL',
        'sasl.mechanism': 'PLAIN',
        'sasl.username': username,
        'sasl.password': password,
        'client.id': socket.gethostname()
    }

    producer = Producer(conf)

    topic = 'faas-request'
    producer.produce(topic, value=event.body)
    producer.flush()

    return {
        "statusCode": 200,
        "body": "Message produced"
    }

def get_secret(name):
    ret = None
    with open("/var/openfaas/secrets/" + name, "r") as file:
        ret = file.read().strip()

    return ret
Run faas-cli up
to deploy the function. Whatever body you use to make a HTTP POST will be published to the topic.
Example messages published via the function:
Example messages showing up on the topic in the Confluent Dashboard
We’ve now configured a development Kafka cluster on Confluent Cloud, which should be free to keep running for low usage and testing. We then configured the Kafka connector with TLS and SASL, then deployed a function to receive messages from the topic, and viewed its logs in the OpenFaaS Dashboard.
Scaling and retries
“The rule in Kafka is a maximum of 1 consumer per partition (as each partition must only be allocated to 1 consumer), so you can only have as many consumers (in a single consumer group) as there are partitions for a topic, but you can also have less.” Instaclustr by NetApp
We recommend installing the connector once per topic for production use, changing the name of the Helm installation so that you can have multiple instances of the connector running in the same cluster. Then scale the Kafka connector deployment to match the number of partitions in the topic.
So if the faas-request
topic has 3 partitions, then you should have 3 replicas of the Kafka connector running. The replicas can be set in the values.yaml file or by running kubectl scale -n openfaas deploy/kafka-connector --replicas=3
.
If the connector crashes for some reason, or the Pod is scheduled to a different node, then Kubernetes will automatically restart it, and it’ll pick up from the last message it processed.
For retries, set the asyncInvoke
option to true
, so that consumed messages get put into the NATS JetStream queue and retried according to the policy you’ve defined. There are more advanced options covered in the docs and Helm chart, but what we’ve covered today should cover 80% of the use-cases for triggering functions from Kafka.
If you have any further questions, please feel free to get in touch with us.
You may also like:
]]>We’ll start by showing how testing functions on your own machine can help you iterate much more quickly than deploying each change to Kubernetes. That’s where faas-cli local-run
comes in. Then, we’ll show the new --watch
functionality in faas-cli up
, for when it makes more sense to test and iterate within a cluster due to dependencies on other services.
This is what the typical development lifecycle of an OpenFaaS function looks like:
We try to minimise the amount of manual steps by bundling some of these actions into a single command. For example, running faas-cli up
will build, push and deploy functions.
The disadvantage of this workflow is that it can introduce some delay as you will have to wait for the function image to be pushed to a registry, pulled into a node and started up each time you make changes to your code.
Thanks to some recent work from the community a couple of new features were added to the faas-cli to further improve the development experience.
For fast local iteration on functions, a new command faas-cli local-run
was added. The command runs a function as a Docker container directly on your machine that spins up pretty much instantly. You won’t have to wait for the function to be deployed to OpenFaaS and become ready before you can invoke it.
A second new feature is the addition of the --watch
flag. It can be used with both the faas-cli up
as well as the faas-cli local-run
command and tells the CLI to watch the filesystem so it can automatically build and redeploy functions as you edit and save your code.
The OpenFaaS CLI has a command, local-run
, that allows users to test functions without deploying. It builds a function into your local image library and starts a container locally with Docker.
This has the advantage that you won’t have to wait for the image to be pushed to a registry, then pulled into a node and started up.
Create a function or use an existing one and try to run it locally with faas-cli local-run
.
We will create and run a simple Node.js function:
# Create a new function using the node18 template
faas-cli new greeter --lang node18
# Rename the function's yaml definition to stack.yml
mv greeter.yml stack.yml
Update greeter/handler.js
so that the function returns a nice greeting message.
'use strict'

module.exports = async (event, context) => {
  return context
    .status(200)
    .succeed(`Greetings from OpenFaaS!!!`)
}
Run the function locally:
faas-cli local-run
The command will first build the function and next run it locally with docker.
The output should look something like this:
#23 exporting to image
#23 exporting layers done
#23 writing image sha256:c384939cb1d69d510c6f1237e371aac21deb2e4ac3f9bc863852084dfda7b20a done
#23 naming to docker.io/library/echo:latest done
#23 DONE 0.0s
Image: echo:latest built.
[0] < Building echo done in 0.66s.
[0] Worker done.
Total build time: 0.66s
Image: echo:latest
Starting local-run for: echo on: http://0.0.0.0:8080
2023/09/05 15:58:55 Version: 0.9.11 SHA: ae2f5089ae66f81a1475c4664cb8f5edb6c096bf
2023/09/05 15:58:55 Forking: node, arguments: [index.js]
2023/09/05 15:58:55 Started logging: stderr from function.
2023/09/05 15:58:55 Started logging: stdout from function.
2023/09/05 15:58:55 Watchdog mode: http fprocess: "node index.js"
2023/09/05 15:58:55 Timeouts: read: 15s write: 15s hard: 10s health: 15s
2023/09/05 15:58:55 Listening on port: 8080
2023/09/05 15:58:55 Writing lock-file to: /tmp/.lock
2023/09/05 15:58:55 Metrics listening on port: 8081
node18 listening on port: 3000
Once the container is running, curl
can be used to invoke the function:
curl http://127.0.0.1:8080
You should see your greeting message in the response printed to the console.
Function logs for each invocation can also be inspected in the console:
2023/09/05 15:58:55 Metrics listening on port: 8081
node18 listening on port: 3000
2023/09/05 16:00:07 POST / - 200 OK - ContentLength: 96B (0.0353s)
By default the function container publishes port 8080. The --port
flag can be used to change the port in case you are already port-forwarding the OpenFaaS gateway or when port 8080 is not available for another reason.
faas-cli local-run greeter --port 3001
This will run the greeter function and make it available on port 3001
.
curl -i http://127.0.0.1:3001
The local-run command is great for running and testing individual OpenFaaS functions but it can only run a single function at a time.
If your stack.yaml file only contains a single function, local-run will run that function by default. When there are multiple functions you need to add the name of the function you want to run as an extra argument to the command.
Create a second function, echo, and append it to the stack.yml file, then pass the function name to local-run:
faas-cli new echo --lang node18 --append stack.yml
faas-cli local-run greeter
Since functions run as individual Docker containers with local-run, they can not talk to other functions or the OpenFaaS gateway.
If you are building function pipelines where you need to talk to other functions or if you need to call other services in your cluster, local-run might not be the best option.
While there are some workarounds like port-forwarding the gateway first and making the gateway url configurable in your function through an environment variable you might want to use faas-cli up --watch
instead.
Running faas-cli up
will build, push and deploy all functions in the stack.yml
file. The --watch
flag will tell the faas-cli to monitor the function source files for any changes and automatically rebuild and redeploy functions as you edit and save your code.
We take a more detailed look into the watch functionality later in this article.
All functions can consume secrets in the same way, by reading a file from: /var/openfaas/secrets/NAME
To mount a secret in a function the secret name has to be added to the list of secrets in the stack YAML file.
As an example we will add a secret named api-key
to the echo function:
functions:
  echo:
    lang: node18
    handler: ./echo
    image: ttl.sh/openfaas/echo:latest
    secrets:
    - api-key
The local-run command looks for secret files in the .secrets
folder. You will need to create any secrets you want in this location.
All secrets included in a function's stack.yaml will be mounted into the function container so they can be read from their usual location, /var/openfaas/secrets/NAME, and used within the function.
Create the .secrets
folder in your local directory and add a file named api-key.
mkdir .secrets
echo "secret-access-token" > .secrets/api-key
We will use the secret to add authorization to the echo function. Update ./echo/handler.js
to read the secret file from /var/openfaas/secrets/api-key
and validate the Authorization header against the api-key:
'use strict'

const { readFile } = require('fs').promises

module.exports = async (event, context) => {
  // Read the api-key from an OpenFaaS secret.
  let token = (await readFile("/var/openfaas/secrets/api-key", "utf8")).trim()

  const result = {
    'body': JSON.stringify(event.body),
    'content-type': event.headers["content-type"]
  }

  let authHeader = event.headers["authorization"]
  let authToken = ""

  // Get the bearer token from the authorization header.
  if (authHeader && authHeader.startsWith("Bearer ")) {
    let parts = authHeader.split(" ")
    authToken = parts[1]
  } else {
    return context
      .status(400)
      .succeed("Invalid authorization header")
  }

  // Verify the bearer token matches the api-key.
  if (authToken != token) {
    return context
      .status(401)
      .succeed("Invalid api key")
  }

  return context
    .status(200)
    .succeed(result)
}
Run the function with local-run and see that the secret is mounted in the container and can be read for use within the function. A HTTP 401 status code should be returned if you invoke the function with an invalid api-key.
Run the echo function:
faas-cli local-run echo
Invoke the function with an invalid api-key:
$ curl -i -s http://127.0.0.1:8080 \
  -H "Authorization: Bearer invalid" \
  -H "Content-Type: text/plain" \
  -d "Greetings from OpenFaaS"
HTTP/1.1 401 Unauthorized
Connection: keep-alive
Content-Length: 15
Content-Type: text/html; charset=utf-8
Date: Thu, 07 Sep 2023 11:20:50 GMT
Etag: W/"f-DQI+tV0HvT0NM/9khJk6PQXU+K4"
Keep-Alive: timeout=5
X-Duration-Seconds: 0.003185
Invalid api key%
Invoking the function with the correct api-key that we saved in the secret should return a HTTP 200 status code.
$ curl -i -s http://127.0.0.1:8080 \
  -H "Authorization: Bearer secret-access-token" \
  -H "Content-Type: text/plain" \
  -d "Greetings from OpenFaaS"
HTTP/1.1 200 OK
Connection: keep-alive
Content-Length: 66
Content-Type: text/html; charset=utf-8
Date: Thu, 07 Sep 2023 11:15:30 GMT
Etag: W/"42-ql9Q2lRIiMOJhodGVWvMjGR/Ezw"
Keep-Alive: timeout=5
X-Duration-Seconds: 0.004418
{"body":"\"Greetings from OpenFaaS\"","content-type":"text/plain"}%
Another feature that can be helpful to iterate on functions quickly during development is the built in watch functionality of the CLI.
The --watch
flag can be used with both the local-run
and up
command. Adding the flag will tell the cli to watch the filesystem for any changes to the function source files and automatically re-build and deploy functions on save.
When using --watch
flag with faas-cli up
it is recommended to also set --tag=digest
. This ensures unique image tags are generated for each build. The next section goes into more detail about the --tag
flag.
We find it convenient to use the temporary registry ttl.sh instead of the Docker Hub for quick testing and prototyping. It’s a little slower, however at the same time it doesn’t require you to log in and any images you push get deleted after 24 hours.
To configure a registry when you create a new function, set OPENFAAS_PREFIX
.
To use ttl.sh, without authentication, and temporary images:
OPENFAAS_PREFIX=ttl.sh/my-project
To use the Docker Hub:
OPENFAAS_PREFIX=docker.io/my-user
All OpenFaaS functions are built into container images. By default if no image tag is included for a function in the stack.yml file the :latest
tag is used. When iterating over functions and pushing them to an image registry it is a best practice to organise different image versions using tags instead of always pushing to :latest
.
There are two options to set tags for function images.
The --tag
option can be used with the build
, push
and deploy
sub-commands of the faas-cli. If this flag is provided, image tags for functions will automatically be generated based on available metadata. This can be either Git metadata like the commit sha or branch name or digest of the function handler content.
The generated tag is always suffixed to any tag defined in the stack.yml file or latest
if no tag is defined.
Some examples:
When using the flag --tag=sha
the image tag used in the stack.yml file is suffixed with the short Git SHA. e.g
functions:
  echo:
    image: ttl.sh/openfaas/echo:0.2
For this stack.yml file the resulting image name will be echo:0.2-cf59cfc
If no tag is set in the stack.yml file the suffix is appended to latest.
image: echo => image: echo:latest-cf59cfc
If you are using faas-cli up
with the --watch
flag we recommend also setting --tag=digest
. The digest is calculated from the function source code and will ensure a unique image tag is generated and pushed for every code change.
Find an overview of all the available tag versions in the docs.
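To give an intuition for the digest mode, here’s a rough sketch of deriving a short, content-based tag from the files in a handler folder. It is not faas-cli’s actual algorithm, just an illustration of why the tag changes whenever the source changes:

import hashlib
import pathlib

def handler_digest(path, length=12):
    # Hash every file under the handler folder so any edit produces a new tag.
    # Illustration only: faas-cli's own digest calculation may differ.
    h = hashlib.sha256()
    for p in sorted(pathlib.Path(path).rglob("*")):
        if p.is_file():
            h.update(p.name.encode())
            h.update(p.read_bytes())
    return h.hexdigest()[:length]

print("ttl.sh/openfaas/echo:latest-" + handler_digest("./echo"))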
Alternatively environment variable substitution can be used to set the image.
Here is an example of a stack.yml file:
functions:
  echo:
    image: ttl.sh/openfaas/echo:${FN_VERSION:-latest}
The value of FN_VERSION
can be set through an environment variable when running commands like faas-cli up
, build
or publish
:
FN_VERSION="0.2" faas-cli build
Environment variable substitution and the --tag
flag can also be used together.
We set out to show you how the new local-run
command worked for faster iteration, the --watch
flag for live-reloading as you edit, and the --tag=digest
flag to generate dynamic tags as you edit.
When it comes to iterating on functions, doing a full deployment with faas-cli up is the easiest and most realistic way to test your work, however, faas-cli local-run can speed things up if your function doesn’t have a lot of dependencies.
You may also like:
Feel free to tweet to @openfaas with your comments, questions and suggestions.
]]>Did you know? Linode was acquired by Akamai, and is now being branded as “Akamai Cloud Computing”. The rebranding is still in-progress, so we’ll be referring to Linode throughout this article.
K3s is a production-ready distribution of Kubernetes that was originally developed by Darren Shepherd at Rancher Labs, before donating it to the Cloud Native Computing Foundation (CNCF). It’s become one of the most popular ways to run Kubernetes on-premises, at the edge, and on IoT devices. So why would you run it on Linode when Linode already offers its own Linode Kubernetes Engine (LKE)?
Both K3s and LKE can be used on Linode to run Kubernetes, but they have different use-cases. LKE is a managed service, so Linode is responsible for maintaining the control plane and upgrading it for you. K3s is a lightweight distribution of Kubernetes that is designed to be easy to install and maintain, and is ideal for running on smaller hosts. Using K3s also means that whatever we setup on Linode, can be set up on-premises or even in our homelab too.
OpenFaaS is one of the earliest Functions As a Service (FaaS) frameworks for Kubernetes, is listed on the CNCF Landscape, and has many open source and commercial adopters running in production.
When you write a function, you focus on a HTTP handler, rather than on boiler-plate coding. You tend to get functions triggered by event sources like Cron, HTTPS requests, asynchronous queues and message buses like Apache Kafka or RabbitMQ.
A quick example function in Python, which reads all rows from a Postgresql users table:
import psycopg2

def handle(event, context):
    password = get_secret("db_password")

    try:
        conn = psycopg2.connect("dbname='main' user='postgres' port=5432 host='192.168.1.35' password='{}'".format(password))
    except Exception as e:
        print("DB error {}".format(e))
        return {
            "statusCode": 500,
            "body": str(e)
        }

    cur = conn.cursor()
    cur.execute("""SELECT * from users;""")

    rows = cur.fetchall()
    return {
        "statusCode": 200,
        "body": rows
    }
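The handler calls a get_secret helper which isn’t shown in the snippet. OpenFaaS mounts each secret as a file under /var/openfaas/secrets, so a minimal version (assuming a secret named db_password has been created with faas-cli secret create and listed in the function’s stack.yml) looks like this:

def get_secret(name):
    # Read an OpenFaaS secret mounted at /var/openfaas/secrets/<name>
    with open("/var/openfaas/secrets/" + name, "r") as f:
        return f.read().strip()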
Here’s what people tend to value in OpenFaaS over a hosted functions service:
Finally, we often hear that teams can both get into production with OpenFaaS in a very short period of time (days) and that they often save costs. In one case, a US-based satellite company saved 180k USD over three years after switching away from AWS Lambda.
You can find a list of companies and their use-cases in the ADOPTERS file, however this is only a very small sub-set of users.
Disclosure: at the time of writing, Linode sponsors the OpenFaaS homepage and provides credits for testing the OpenFaaS project. This article was commissioned by Linode/Akamai.
New customers can get free credit with Linode to try out this tutorial.
There are many knobs and dials to configure Kubernetes or K3s for production. We won’t be covering each and every option, because each team’s requirements will vary so much. Instead we’ll focus on creating a High Availability (HA) cluster, secure Ingress with TLS encryption, and then we’ll deploy OpenFaaS to it.
Highly Available K3s cluster, with a Load Balancer
For a HA control-plane, K3s supports using a database or an embedded etcd cluster.
We’ll go through the following steps:
From there it’s up to you to decide which parts you may want to automate with a GitOps or IaC tool such as Flux for the Helm charts, or Terraform for the VMs themselves.
Before we get started, I’d advise using my arkade tool to download all the various CLIs that we’re going to need.
curl -sLS https://get.arkade.dev | sh
Follow the command to move arkade to /usr/local/bin/
using sudo
.
Then:
arkade get \
terraform \
faas-cli \
kubectl \
helm
arkade is a time-saver for both downloading developer tools, but also for installing Helm charts, which we will see in the later steps, when we’ll run commands like arkade install cert-manager
. If you look carefully at the output, you’ll see that it’s a wrapper for the Helm command itself.
See also: Use Terraform to Provision Infrastructure on Linode
On Linode, VMs are called “Linodes”, but we will be referring to them as VMs to avoid ambiguity.
We will need to configure both private and public networking for the VMs, so that K3s itself doesn’t send all of its control-plane traffic over the public internet. I didn’t do this with my initial testing and saw over 250GB of traffic between the three VMs over the course of a week. This is normal for Kubernetes, but it needs to run over a private network which is free and unmetered.
I didn’t realise this initially, but if you use a private IP address for your VMs on Linode, they end up being exposed to every other VM in that region, but hidden from the Internet. So what we actually want is a VLAN, along with a private IP address, that way they’re private within our own account.
Linode VLANs operate at Layer 2 of the OSI model, and you can have up to 10 of them per region. Each VM can belong to a maximum of three separate VLANs.
The Terraform to create the VMs is rather verbose and complicated, however here’s the gist of it:
g6-dedicated-2 plan for 2x dedicated vCPUs and 4GB of RAM

The complete Terraform script is available here: alexellis/k3s-linode
See also:
You can find more detailed documentation on Linode’s interface configuration here: Guides - Create a Private Network with VLANs Using Linode’s API
You’ll also want to create a main.tfvars
file with the token created from within the Linode dashboard:
api_token = "xyz"
It doesn’t seem possible to create a VLAN via Terraform, so you’ll need to create an instance, attach a VLAN, and then delete the instance. The VLAN will remain, and can then be referenced by Terraform. If the Linode team is listening, it’d be nice to have an API or CLI command for this in the future.
“VLANs can be configured when creating new instances or by modifying the network interfaces on the Configuration Profile of an existing instance” (source)
To create the VMs, run:
terraform apply -var-file ./main.tfvars
You’ll get the server IPs printed out as follows - bear in mind that the values may not be ordered alphabetically, so pay extra attention when copying and pasting values.
Outputs:
nodebalancer = "139.144.247.125"
servers = {
  "48521666" = {
    "label" = "k3s-server-3"
    "public_ip" = "139.162.250.98"
    "vlan_ip" = "192.168.3.3"
  }
  "48521667" = {
    "label" = "k3s-server-2"
    "public_ip" = "176.58.106.122"
    "vlan_ip" = "192.168.3.2"
  }
  "48521668" = {
    "label" = "k3s-server-1"
    "public_ip" = "176.58.106.241"
    "vlan_ip" = "192.168.3.1"
  }
}
Now that you have the IP addresses for the VMs, you can build the k3sup commands to perform the installation.
K3sup is an open-source tool I wrote to install K3s over SSH. It makes managing all the configuration much simpler, and within a very short period of time, you can have a HA cluster up and running, with a Load Balancer providing a stable IP address for accessing the cluster via kubectl.
With k3sup, there is no need to log into your VMs, or to run any commands on them. K3sup does everything, including fetching a kubeconfig file and merging it into your existing one, so that you can access the cluster with kubectl.
Example installation of K3s with K3sup
Setup the first server:
export CHANNEL="latest"
export USER=root
export TLS_SAN="139.144.247.125"
export SERVER_IP="176.58.106.241"
export SERVER_VLAN_IP="192.168.3.1"
k3sup install \
--cluster \
--ip $SERVER_IP \
--user $USER \
--k3s-channel $CHANNEL \
--merge \
--local-path $HOME/.kube/config \
--context k3s-openfaas \
--k3s-extra-args "--node-ip $SERVER_VLAN_IP --node-external-ip $SERVER_IP --flannel-iface eth1 --disable=traefik" \
--tls-san $TLS_SAN
We specify additional arguments for K3s including:

- --node-ip to set the node's internal IP to its VLAN address
- --node-external-ip to set the node's public IP
- --flannel-iface eth1 so that pod networking traffic goes over the private VLAN interface
- --disable=traefik because we will be installing ingress-nginx instead
This creates your KUBECONFIG and merges the cluster under a new context name:
kubectx k3s-openfaas
If you get anything wrong, log in with SSH and remove k3s using sudo /usr/local/bin/k3s-uninstall.sh
. You shouldn’t need to reboot, but it may help if things are not working as expected.
Running sudo systemctl cat k3s
is also useful for checking that the server IP and node local IP addresses are set correctly.
Confirm that the INTERNAL-IP and EXTERNAL-IP fields are populated with the VLAN IP and Public IP respectively:
kubectl get node -o wide
NAME STATUS ROLES AGE VERSION INTERNAL-IP EXTERNAL-IP OS-IMAGE KERNEL-VERSION CONTAINER-RUNTIME
k3s-server-1 Ready control-plane,etcd,master 6s v1.27.4+k3s1 192.168.3.1 176.58.106.241 Ubuntu 22.04.2 LTS 5.15.0-73-generic containerd://1.7.1-k3s1
Then install the second server:
export EXTRA_SERVER_IP="176.58.106.122"
export EXTRA_SERVER_VLAN_IP="192.168.3.2"
k3sup join \
--server \
--server-ip $SERVER_IP \
--ip $EXTRA_SERVER_IP \
--user $USER \
--k3s-channel $CHANNEL \
--k3s-extra-args "--node-ip $EXTRA_SERVER_VLAN_IP --node-external-ip $EXTRA_SERVER_IP --flannel-iface eth1 --disable=traefik" \
--tls-san $TLS_SAN
Verify that the server was added as expected with: kubectl get node -o wide --watch
.
Confirm that the IP addresses are correct and that the second server is in a Ready status:
kubectl get node -o wide
NAME STATUS ROLES AGE VERSION INTERNAL-IP EXTERNAL-IP OS-IMAGE KERNEL-VERSION CONTAINER-RUNTIME
k3s-server-1 Ready control-plane,etcd,master 6m57s v1.27.4+k3s1 192.168.3.1 176.58.106.241 Ubuntu 22.04.2 LTS 5.15.0-73-generic containerd://1.7.1-k3s1
k3s-server-2 Ready control-plane,etcd,master 10s v1.27.4+k3s1 192.168.3.2 176.58.106.122 Ubuntu 22.04.2 LTS 5.15.0-73-generic containerd://1.7.1-k3s1
Now, finally add the third server:
export EXTRA_SERVER_IP="139.162.250.98"
export EXTRA_SERVER_VLAN_IP="192.168.3.3"
k3sup join \
--server \
--server-ip $SERVER_IP \
--ip $EXTRA_SERVER_IP \
--user $USER \
--k3s-channel $CHANNEL \
--k3s-extra-args "--node-ip $EXTRA_SERVER_VLAN_IP --node-external-ip $EXTRA_SERVER_IP --flannel-iface eth1 --disable=traefik" \
--tls-san $TLS_SAN
As before, verify that the third server has been added.
With K3s, the costs can be kept quite low because the servers running the control-plane can also run user workloads. However, if you expect very heavy use or I/O intensive applications, then you could also add some agents to the cluster.
This exercise is left for the reader, you could either duplicate the terraform, and replace the word “server” for “agent”, or adapt it so that you input the number of servers and the number of agents separately. Another option is to create an agent via the Linode CLI or UI.
Once your VM is created, use the server IP of any of the three machines under the --server-ip
flag. After it joins the cluster, K3s will tell it about the other server IPs in the case that one of them goes down.
export SERVER_IP="176.58.106.241"
export AGENT_IP="109.74.199.152"
export AGENT_VLAN_IP="192.168.3.4"
export USER=root
export CHANNEL="latest"
k3sup join \
--server-ip $SERVER_IP \
--ip $AGENT_IP \
--user $USER \
--k3s-channel $CHANNEL \
--k3s-extra-args "--node-ip $AGENT_VLAN_IP --node-external-ip $AGENT_IP --flannel-iface eth1"
The agent will show up on the output from kubectl get node
:
kubectl get node
NAME STATUS ROLES AGE VERSION
k3s-agent-1 Ready <none> 18s v1.27.4+k3s1
k3s-server-1 Ready control-plane,etcd,master 8d v1.27.4+k3s1
k3s-server-2 Ready control-plane,etcd,master 8d v1.27.4+k3s1
k3s-server-3 Ready control-plane,etcd,master 8d v1.27.4+k3s1
In this section we’ll install the control-plane components, and OpenFaaS.
Then we’ll deploy a function in the following section.
Conceptual architecture for OpenFaaS control-plane
OpenFaaS will deploy several other components that are not pictured above:
We’ll use ingress-nginx for our Ingress Controller and cert-manager to obtain and renew Let’s Encrypt TLS certificates for our Ingress Controller. This will allow us to access our functions over HTTPS, along with anything else we may want to deploy to the cluster.
arkade install ingress-nginx
Follow this up with:
arkade install cert-manager
This next step is important due to the way that cert-manager performs its self-checks for ACME HTTP01 challenges.
Edit the service for Ingress Nginx, then add the following to the spec:
kubectl edit svc/ingress-nginx-controller
spec:
+ externalIPs:
+ - 139.144.247.125
Replace 139.144.247.125
with the IP address of the NodeBalancer.
cert-manager will be used in the next stage to obtain a TLS certificate for the OpenFaaS Gateway and UI.
Next install OpenFaaS with either the Community Edition (CE) or one of the versions designed for production and commercial use: OpenFaaS Standard or OpenFaaS for Enterprises.
For commercial versions of OpenFaaS, we recommend installing via the OpenFaaS Helm chart, and keeping a copy of your values.yaml file safe for future upgrades.
OpenFaaS CE can also be installed very quickly with the arkade tool. arkade is a wrapper for the Helm chart which reduces all the steps down to a single command:
arkade install openfaas
Now, create a DNS A record for the NodeBalancer’s IP address i.e. openfaas.example.com
.
Next, you can create a TLS certificate for the OpenFaaS Gateway and UI:
export DOMAIN=example.com
arkade install openfaas-ingress \
--email webmaster@$DOMAIN \
--domain openfaas.$DOMAIN
If you want to create Kubernetes YAML files for the Ingress, instead of using arkade, then see these instructions: TLS for OpenFaaS.
You can now run arkade info openfaas
to get the instructions to log in with the CLI and to how to get the password to access the UI.
Instead of using the suggested port-forwarding, you’ll be able to use your TLS-enabled URL to access the UI and CLI.
echo Access the UI at: https://openfaas.$DOMAIN
echo Login in with:
PASSWORD=$(kubectl get secret -n openfaas basic-auth -o jsonpath="{.data.basic-auth-password}" | base64 --decode; echo)
echo $PASSWORD | faas-cli login --password-stdin --gateway https://openfaas.$DOMAIN
Check it worked by deploying the nodeinfo function from the store:
export OPENFAAS_URL=https://openfaas.example.com
faas-cli store deploy nodeinfo
faas-cli describe nodeinfo
echo | faas-cli invoke nodeinfo
You should see the invocation count increase when running the following:
faas-cli list
Function Invocations Replicas
nodeinfo 1 1
The aim of this tutorial is to focus on the infrastructure, however since it’s relatively quick, we’ll also create a custom Python function and deploy it to the cluster.
Every function will be built into a container image and published to a container registry. When it's deployed, a fully qualified image reference is sent to the Kubernetes cluster. Kubernetes will then pull down that image and start a Pod from it for the function.
In production, you’re going to need to use a private registry, or a public registry with authentication enabled.
Follow the steps here to set it up: Configure a private registry
Next, pull down the Python HTTP templates from the store:
faas-cli template store pull python3-http
Create a new function, then rename its YAML file to stack.yml. We do this so we don’t need to specify the name using --yaml
or -f
on every command. A stack.yml file can contain multiple functions, but we’ll only be using one right now.
See also: stack.yaml reference
# Change this line to your own registry:
export OPENFAAS_PREFIX="docker.io/alexellis2"
faas-cli new --lang python3-http \
ping-url
mv ping-url.yml stack.yml
We’ll use the requests library to make a HTTP request to any URL passed in to the function.
Edit ping-url/requirements.txt
and add the following line:
requests
Next, edit ping-url/handler.py
and replace the contents with the following:
import requests
import sys

def handle(event, context):
    url = event.body.decode("utf-8")

    if not url:
        return {
            "statusCode": 400,
            "body": "Please provide a URL to ping"
        }

    body = ""
    statusCode = 200

    try:
        res = requests.get(url)
        body = res.text
        statusCode = res.status_code
    except Exception as e:
        sys.stderr.write("Error reaching remote server {}".format(str(e)))
        sys.stderr.flush()
        return {
            "statusCode": 500,
            "body": "Error: " + str(e)
        }

    return {
        "statusCode": 200,
        "headers": {
            "Content-Type": "application/json",
        },
        "body": {
            "remoteBody": body,
            "remoteStatusCode": statusCode,
        }
    }
Run the following to test on your own machine:
faas-cli local-run
This is a convenient way to test functions without deploying them into the cluster. Any secrets that you add to a function should be written into a .secrets folder, and most other things will work, apart from when you are connecting to services within the remote cluster itself. When using this mode, trim off the “/function/” prefix that is used to invoke OpenFaaS functions.
Or you can deploy it straight to the Kubernetes cluster using faas-cli:
faas-cli up
Then, invoke the function when ready.
Every time I change the function, I like to use a new image tag, to make sure Kubernetes will definitely update the function. You can do this by editing the image field in the YAML file, or by passing the --tag digest flag to faas-cli. If you're making a git commit between each change, you can also use --tag sha to replace the tag with the commit SHA.
Here's an example of the image name generated with --tag digest: docker.io/alexellis2/ping-url:d5f20526c2685e92bad718f54a74f338
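For instance, assuming the function's folder is inside a git repository with at least one commit, the following should build, push and deploy the function tagged with the current commit SHA:

faas-cli up --tag sha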
We can access any website such as Wikipedia:
$ curl -i -s https://openfaas.example.com/function/ping-url -d "https://wikipedia.org"|head -c 500
HTTP/1.1 200 OK
Content-Length: 97541
Content-Type: application/json
Date: Fri, 18 Aug 2023 09:51:41 GMT
Server: waitress
X-Duration-Seconds: 0.226038
{"remoteBody":"<!DOCTYPE html>\n<html lang=\"en\" class=\"no-js\">\n<head>\n<meta charset=\"utf-8\">\n<title>Wikipedia</title>\n<meta name=\"description\" content=\"Wikipedia is a free online encyclopedia, created and edited by volunteers around the world and hosted by the Wikimedia Foundation.\">\n<script>\ndocument.documentElement.classN
Or, we can even access the built-in health check of the function itself:
$ curl -i https://openfaas.example.com/function/ping-url -d "http://127.0.0.1:8080/_/health"
HTTP/1.1 200 OK
Content-Length: 43
Content-Type: application/json
Date: Fri, 18 Aug 2023 09:49:25 GMT
Server: waitress
X-Duration-Seconds: 0.002562
{"remoteBody":"OK","remoteStatusCode":200}
This template uses Flask under the hood for efficiency, and you can get more instructions on how to use a database and so forth in its repository: openfaas/python-flask-template
You can find out how many invocations the function has had by running faas-cli list, or faas-cli describe ping-url.
There's also a Grafana dashboard available for the Community Edition, and 4 extra ones available for OpenFaaS Standard and For Enterprises. We find these essential for finding out whether functions have issues with CPU/memory usage, are running for too long, or are producing errors.
Dashboard for OpenFaaS Standard
See also: OpenFaaS Grafana dashboards
I have written two eBooks that cover writing functions for OpenFaaS: one specialises in Node.js / JavaScript and is called Serverless For Everyone Else, and the second uses primarily Golang (Go) and is called "Everyday Golang". You can buy either or both in the OpenFaaS Store, and GitHub sponsors on certain tiers get a 20% discount on them.
In a relatively short period of time, we built a production-grade K3s cluster, with a High Availability control-plane, and an IP address that would balance traffic between each of the three servers. We then installed an Ingress Controller and obtained a TLS certificate for it, before finally installing OpenFaaS and deploying a custom function.
If you want to trigger a function on a timed basis, such as with Cron, you should check out the cron-connector, which is covered in detail in my eBook Serverless For Everyone Else.
As further work for the reader, you could adapt the Terraform script to also create a number of workers, or agents as K3s calls them. Do this either by adding a new section or by making a copy of the file, and replacing the word “server” with “agent”.
Today we only scratched the surface; there are many different event triggers, language templates and ways to run functions - both synchronously, or out of band in a queue with the highly parallelised async mode.
It's also worth noting that if you plan on serving traffic in a bursty fashion, where there may be millions of requests per minute followed by periods of almost no traffic, then Linode's LKE service may be a better fit than K3s, because it can automatically scale the number of VMs which make up the cluster. More nodes means more capacity to serve traffic.
New Linode customers can get free credit to try out this tutorial with K3s or LKE.
Learn more:
Around 2017, Kubernetes adopted a new pattern called the Custom Resource Definition (CRD), a graduation of prior work called Third Party Resources. Before either of these efforts, any vendor wanting to integrate with Kubernetes had to work with the standard APIs and somehow decorate them with metadata to show that they belonged to a certain integration.
With the CRD came the Operator pattern. You create a Custom Resource (an instance of a CRD) and get to have a descriptive name like "Function" or "Tunnel" (in the case of inlets). You can then build a so-called Operator whose sole purpose is to watch for instances of this Custom Resource, and then to create or update native Kubernetes APIs.
When I saw that Kubernetes was dominating the market, and that Docker Swarm was unfortunately on the way out, I wrote the first version of “faas-netes”. faas-netes was the cornerstone of the Kubernetes support, and because everything had been built in such a modular way, it was really the only thing we had to change, that and creating a Helm chart.
The first version of faas-netes had a REST API that would create a Deployment and a Service for every function a user deployed. The list-functions HTTP handler would search for Deployment objects with a certain label, i.e. faas_function, and then filter on that. We call this version of the code "the controller" and it has an imperative API - you tell it what to do, and it has to do it right then and there.
When we built a Function Custom Resource Definition, we had to revisit the code and build an operator that would watch for instances of the Function CRD and then create a Deployment and Service, just like the "controller" mode did.
Conceptual diagram - the Operator mode for faas-netes with the Function CRD.
Long story short, there are a number of benefits of migrating to the Function CRD. One of them is that you can export all of your deployed functions as plain YAML, for backups or for a GitOps-style workflow:
kubectl get function -n openfaas-fn -o yaml > functions.yaml
Then, the final benefit is that you can take advantage of the kubectl CLI to explore functions, in addition to faas-cli, with kubectl get/edit/describe/delete function.
If you’re performing a brand new installation, then just set the following in the Helm chart to enable the CRD and the Operator:
openfaasPro: true
operator: true
That’s it. From there onwards, you can deploy functions via the REST API or using kubectl.
For instance:
$ faas-cli generate --from-store nodeinfo
---
apiVersion: openfaas.com/v1
kind: Function
metadata:
  name: nodeinfo
  namespace: openfaas-fn
spec:
  name: nodeinfo
  image: ghcr.io/openfaas/nodeinfo:latest
  labels: {}
  annotations: {}
You can save this into a file and apply it with kubectl apply -f nodeinfo.yaml.
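Putting those steps together, a minimal workflow looks like this:

faas-cli generate --from-store nodeinfo > nodeinfo.yaml
kubectl apply -f nodeinfo.yaml

# Then list the functions via the CRD
kubectl get function -n openfaas-fn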
You can still deploy functions using faas-cli up or faas-cli deploy, and you'll get the benefits of the Operator that we talked about above on top.
Prior to our latest release, you'd have had to use our migration tool to back up all the deployed functions into the Function CRD format, upgrade the OpenFaaS installation, then delete the old functions and re-create them using the backup file.
Fortunately with our latest change, a one-time migration is performed automatically if you’re running in the Operator mode.
Finally, it writes a ConfigMap into the openfaas namespace to prevent the operation from running again.
So whether you’re an existing user coming up from the Community Edition (CE), or have been running OpenFaaS Standard or Enterprise for a while, the only thing you need in your Helm chart is:
openfaasPro: true
operator: true
Then update the installation just as you always would with helm upgrade --install.
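As a sketch, assuming the official chart repository and an existing values.yaml that contains the two settings above:

helm repo add openfaas https://openfaas.github.io/faas-netes/
helm repo update

helm upgrade --install openfaas openfaas/openfaas \
  --namespace openfaas \
  -f values.yaml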
When we started out, customers had to delete functions when upgrading to the Operator and then deploy them again. Then we built a backup tool, and now we've gone one step further to improve the developer experience with an automated migration, built into the faas-netes code. There's nothing for you to do - it just kicks in when you turn on the Operator and does the most obvious thing.
So with this latest improvement, we’d like to see all customers moved onto the Function CRD, for the features and benefits it provides - both for us as maintainers and for you, as users.
We’d recommend running the migration on a backup or temporary cluster first, to make sure all your functions convert and come up as expected. This is what dev and staging are for after all, right?
You may also like:
You may be familiar with the original UI that I built for OpenFaaS in 2017. It was focused on both invoking functions and deploying new ones from the Function Store. It used the original 1.x version of Angular.js and was built within the first year of the project. It served us well, but the underlying UI frameworks have moved on considerably since its inception.
The classic UI, which is part of faasd and OpenFaaS Community Edition (CE) is now in code-freeze which means it isn’t receiving changes. That’s generally not an issue for its intended audience of personal users and hobbyists. If you’re using OpenFaaS CE at work, it may be time to check out what kind of value we’ve been building over the past few years.
I think it was a key part of the adoption and developer love that we saw around the project. People like to see things, but many developers who are attracted to the kind of back-end coding that FaaS frameworks or Kubernetes controllers involve have an aversion to, or lack of skill in, front-end development.
So that's where we come in, as the stewards and full-time team behind OpenFaaS. We decided to release a new dashboard, with a fresh approach and a fresh UI framework (React).
The new dashboard is geared around running OpenFaaS in production - so it’s for commercial teams who want more visibility and control over their functions.
OpenFaaS has its own REST API and Grafana dashboards, there’s also lots of monitoring options for Kubernetes itself, so what value can we add?
In a word - consolidation - bringing together the most important things you need within one UI for proactive use. For passive monitoring of throughput, scaling, errors and the duration of functions, we’d recommend using the Grafana dashboards we supply to customers.
Let’s take a quick tour of the new features, why we think you’ll find them useful, and how you can try each of them out.
We built a brand new experience for invoking functions using a code editor to format the input and output, and a dedicated tab for the response headers. This is a big improvement over the old UI which had a single text box for input and output.
Here’s the view showing the response headers:
You can also supply your own list of headers for the invoke, the method such as GET or POST, an additional path or a query-string. All of this is new.
One of the biggest concerns I've had is watching commercial teams share a single password for their whole OpenFaaS installation. We've offered SSO with various OpenID Connect (OIDC) providers for several years, but with our new IAM for OpenFaaS feature it's really well integrated.
Here’s an example of the redirect to Keycloak, a popular open source project hosted by the Cloud Native Computing Foundation (CNCF):
Logging in with Keycloak, which can be federated to OIDC, LDAP and SAML providers.
When combined with the new IAM feature, you can also restrict access to read-only roles, or even to specific namespaces, or remove access all together from other company employees outside of your group.
SSO is fine-grained, so not everyone with a company email can just log in and manage your functions.
Authorized users will gain access to their own namespaces, which is a useful way to do multi-tenancy or just to organise internal teams with OpenFaaS.
Above: You can use a single Kubernetes cluster for multiple stages of your application, like production and staging, or for multiple tenants.
Learn more:
For those of you who don’t use an Identity Provider (IdP) in your organisation, we’ve gone one better over the previous Basic Authentication approach.
A login form is used with TLS encryption instead of more rudimentary browser-based Basic Authentication.
Your password can now be remembered by a password manager or the browser itself, which makes it easy to manage multiple environments like dev, staging and production.
We know you don’t want to deploy functions through the UI in production, so we simply don’t offer it. Instead, you can use Helm, ArgoCD, Flux, kubectl along with the Function CRD, the CLI via a CI/CD job, or even the REST API to deploy functions.
With the popularity of Infrastructure as Code (IaC) and GitOps, we are sure that 90% of you will be releasing code from a git repository, with an associated SHA and URL.
With the growing understanding of the dangers of Common Vulnerabilities and Exposures (CVEs) in containers, it’s important to know when a function was last deployed.
Above: Git SHA, link to the repository, and the date of the last deployment.
You can now hot-link directly to a code diff to see what changed in production, if a function is behaving unexpectedly.
The direct hot-link into GitHub or GitLab will show you the precise change that has made its way into production.
Add the metadata by clicking “Set metadata” on the details page, or by adding the labels and annotations specified in the documentation.
The old dashboard had a list of functions, where you'd need to click into each one to see what was a very limited set of data.
Now, the new dashboard shows: replica count, RAM, CPU, 1hr and 24 hr success vs. error rates, and metadata from CI/CD.
A much richer overview of what you need to know about a function at a glance.
You can now use the new dashboard to view the logs of a function without needing kubectl or faas-cli installed on your machine. That means you can use your iPad, phone, or a more restrictive environment to debug a problem with a function too.
I wrote a very simple function in Go using the golang-middleware template to show you how it works.
package function

import (
	"fmt"
	"io"
	"log"
	"net/http"
	"strings"
)

func Handle(w http.ResponseWriter, r *http.Request) {
	var input []byte

	if r.Body != nil {
		defer r.Body.Close()

		body, _ := io.ReadAll(r.Body)
		input = body
	}

	// Collect the request headers into a single string, then log them
	// so that they show up on the dashboard's Logs page.
	headers := ""
	for k, v := range r.Header {
		headers += fmt.Sprintf("%s=%s, ", k, v)
	}
	headers = strings.TrimRight(headers, ", ")

	log.Printf("Input headers: %v", headers)

	w.WriteHeader(http.StatusOK)
	w.Write([]byte(fmt.Sprintf("Body: %s", string(input))))
}
The headers from the request will be logged to stderr and shown on the “Logs” page.
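For instance, you could invoke the function with a custom header and then read the logs back (a sketch; "log-headers" is a hypothetical function name, so use whatever name you deployed it with):

curl -i https://openfaas.example.com/function/log-headers \
  -H "X-Trace-Id: 1234" \
  -d "Hello OpenFaaS"

# The same logs shown in the dashboard can also be fetched with the CLI
faas-cli logs log-headers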
The headers being inputted into the invocation
The resulting logs in the dashboard:
The headers printed into the logs
If you’ve used an earlier version of this page, then you may have noticed that we’ve added a new drop-down where you can pick the age of the logs to show going back up to one day.
That was a quick tour of the new OpenFaaS dashboard designed for use in production and in commercial settings. You can try it out with OpenFaaS Standard and OpenFaaS for Enterprises by enabling the dashboard in the Helm chart.
Coming up next, we’re looking at combining some of the recommendations from the OpenFaaS config-checker tool with the dashboard to show you how to get the most value out of the platform.
The chances are that if you’re running in production, you may also benefit from: multiple namespaces, fine-grained permissions, parallelism with JetStream for OpenFaaS, Scale to Zero, the Kafka event connector, and our set of Grafana dashboards for observability.
We find that the OpenFaaS Dashboard is useful for immediate feedback, and the Grafana Dashboards provide us with more proactive monitoring.
Understand if a function has a memory leak, or is consuming excessive CPU, how many replicas are running, how long requests take to process, and how many errors are being generated.
In one recent case, a customer was about to promote a Go function to production, and the dashboard showed him a memory leak which he was unaware of - that swiftly got fixed before it could cause an outage.
In a second case, we noticed on a support call that a function was using 6 vCPU at idle - not just requesting, but actually consuming that amount. The customer was completely unaware, and the UI dashboard helped them to identify the problem.
If you’d like to try out the dashboard for your team, or want to talk to us, get in touch here.