The metastatic driver now includes the providers.[metastatic].pools.host-key-checking and providers.[metastatic].pools.labels.host-key-checking options. These can be used to verify that a backing node is still functioning correctly before allocating new metastatic nodes to it.
It is now possible to provide a custom pod definition for the Kubernetes and OpenShift drivers using the providers.[kubernetes].pools.labels.spec, providers.[openshift].pools.labels.spec and providers.[openshiftpods].pools.labels.spec attributes. These can be used to supply parameters not otherwise supported by Nodepool, or to create complex pods with multiple containers.
The default providers.[aws].diskimages.volume-type for AWS diskimages has been changed from gp2 to gp3.
The hostname-format option, previously available for the OpenStack and AWS drivers, but unused for some time, has been removed. Remove any uses of it from nodepool config files before upgrading; otherwise they will fail config validation.
The AWS driver now supports a providers.[aws].image-import-timeout option to control automatic retries and timeouts when AWS import task resource limits are reached.
The AWS driver now supports volume quota. It will automatically register the limits from the cloud and ensure that labels that specify EBS volume attributes stay under the limit.
The Azure driver now supports specifying the size of the OS disk.
The Kubernetes and OpenShift drivers now support adding dynamic metadata, i.e. Pod and Namespace labels, with information about the corresponding node request. This is analogous to the existing dynamic tags of the OpenStack, AWS, and Azure drivers.
The Azure driver now supports specifying community and shared gallery images.
The metastatic driver will now automatically use the node-attributes from backing nodes as default values for node-attributes of metastatic nodes. Any node-attribute values specified in the metastatic pool config will override those from the backing node.
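The override rule described above amounts to a dictionary merge in which the pool config wins on conflicts. The following is a minimal sketch of that rule only; the function name and the sample attribute keys are hypothetical and are not taken from Nodepool's code.

```python
def merge_node_attributes(backing_attrs, pool_attrs):
    """Combine node-attributes for a metastatic node.

    Values from the backing node act as defaults; any key that is also
    set in the metastatic pool config takes the pool's value.
    """
    merged = dict(backing_attrs or {})
    merged.update(pool_attrs or {})
    return merged


# Hypothetical attribute values, for illustration only.
backing = {"region": "east", "tier": "standard"}
pool = {"tier": "premium"}
print(merge_node_attributes(backing, pool))
```

Keys that appear only on the backing node are carried over unchanged.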
An upload timeout can be configured on OpenStack providers for use in the case that image uploads to a provider take longer than the default of one hour.
The Azure driver now uses the “Standard” SKU for all public IP addresses. Previously it would choose either “Standard” or “Basic” depending on the selection of IPv4 or IPv6 addresses.
Pricing for public IP addresses may differ between the SKU levels.
Standard IP addresses block all incoming traffic by default, therefore the use of a Network Security Group is required in order to allow incoming traffic.
If you are not currently using a Network Security Group, then before upgrading Nodepool it is recommended to create one, add any required rules, and attach it to the subnet that Nodepool uses.
The Azure driver no longer creates and deletes Disks, Network Interfaces, or Public IP Addresses as separate steps.
Nodepool and user-supplied tags will no longer be applied to Network Interfaces or Public IP Addresses. This also limits Nodepool’s ability to detect leaks of these resources (however, this is unlikely since Azure is now responsible for deleting them).
The AWS driver now supports importing images using either the “image” or “snapshot” import methods. The “snapshot” method is the current behavior and remains the default; it is the fastest and most efficient in most circumstances. The “image” method is available for images which require certain AWS licensing metadata that can only be added via that method.
Many more drivers now report metrics for leaked resources.
The OpenStack driver now supports volume quota. It will automatically register the limits from the cloud and ensure that labels that utilize boot-from-volume stay under the limit. Limits can also be specified at the pool and tenant level in Nodepool’s configuration.
Added support for requesting GPU resources in the Kubernetes and OpenShift drivers.
Added support for specifying Kubernetes and OpenShift pod resource limits separately from requests.
Added support for specifying the scheduler name, additional metadata, and volume mounts in Kubernetes and OpenShift drivers.
New diskimage build ids are now in UUID format instead of integer sequences with leading zeroes. This facilitates faster restoration of images from backup data on systems with large numbers of image builds. Existing builds will continue to use the integer format and can coexist with the new format.
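Tooling that reads build ids may need to accept both formats during the transition. The sketch below shows one way to tell them apart using only the standard library. It assumes the dashed textual UUID form; the function names are hypothetical and this is not Nodepool's own code.

```python
import uuid


def new_build_id():
    """Generate a UUID-style build id (dashed form is an assumption)."""
    return str(uuid.uuid4())


def classify_build_id(build_id):
    """Return 'integer' for legacy zero-padded ids, 'uuid' for new ones.

    Raises ValueError if the id matches neither format.
    """
    if build_id.isdigit():
        return "integer"
    uuid.UUID(build_id)  # raises ValueError on malformed input
    return "uuid"


print(classify_build_id("0000000042"))
print(classify_build_id(new_build_id()))
```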
Python 3.11 is now the only version of Python with which Nodepool is tested.
Nodepool now requires ZooKeeper version 3.6.0 or later (note that at the time of this writing, the oldest supported ZooKeeper version is later than that).
In the OpenStack driver, when using `min-ram` in combination with a `flavor-name`, the first flavor to be found that satisfied the `min-ram` requirements and contained the substring `flavor-name` would be used. The order of the flavors to be searched was dependent on what the cloud returned. From this release, the available flavors are now alphabetically sorted before matching.
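The effect of sorting before matching can be sketched as follows. This is an illustration of the rule described above, not the driver's code: the Flavor class, the function name, and the assumption that sorting is by flavor name are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Flavor:
    name: str
    ram: int  # in MB


def find_flavor(flavors, min_ram, flavor_name=None):
    """Pick the first matching flavor after sorting by name.

    A flavor matches if it has at least min_ram and, when flavor_name
    is given, its name contains flavor_name as a substring.
    """
    for flavor in sorted(flavors, key=lambda f: f.name):
        if flavor.ram < min_ram:
            continue
        if flavor_name is not None and flavor_name not in flavor.name:
            continue
        return flavor
    return None


# Hypothetical flavors; the cloud may return them in any order.
flavors = [
    Flavor("perf-8gb", 8192),
    Flavor("gen-8gb", 8192),
    Flavor("perf-16gb", 16384),
]
print(find_flavor(flavors, min_ram=8192, flavor_name="perf"))
```

Because the list is sorted first, the result no longer depends on the order returned by the cloud. Note that alphabetical order is not size order: in this example "perf-16gb" sorts before "perf-8gb".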
The nodepool command now includes a “hold” subcommand.
The Kubernetes driver now supports Kubernetes version 1.24 and later.
The AWS driver now supports specifying volume IOPS and throughput; see: providers.[aws].pools.labels.iops, providers.[aws].pools.labels.throughput, providers.[aws].diskimages.iops, and providers.[aws].diskimages.throughput.
The OpenStack, AWS, and Azure drivers now support adding tags (AKA metadata, AKA properties) to instances with dynamic data about the corresponding node request using string formatting.
The AWS and Azure drivers now support adding tags to images via the providers.[aws].diskimages.tags and providers.[azure].diskimages.tags attributes. The OpenStack driver already supported similar behavior with its meta attribute.
The diskimages.metadata attribute has been added and will supply default values to the provider image tag attributes mentioned above.
A new configuration option for K8s Pod type labels was added to limit the amount of ephemeral storage allocatable in a container (cf. the K8s "Local ephemeral storage" resource documentation: https://kubernetes.io/docs/concepts/configuration/manage-resources-containers/#local-ephemeral-storage). This limit is set as an integer value and is treated as megabytes. A pool-scoped default value can also be specified.
The AWS driver now supports multiple quotas for specific instance types. This support is automatic, but also includes corresponding enhancements to provider, pool, and tenant limits configured in Nodepool.
A new statsd metric, nodepool.image_build_requests, is available. It reports the number of outstanding manual image build requests.
A new image-status command and accompanying web endpoint are available to easily see what images have been paused via the image-pause command and have pending manual build requests via the build-image command.
The AWS, Azure, and IBMVPC drivers now check provider quota before accepting requests. This allows them to decline requests which cannot possibly be satisfied given provider quota constraints.
Config options for Kubernetes providers were added to define default limits for CPU and memory for pod-type labels.
These values apply to all pod-type labels within the same pool that do not override these limits. This makes it possible to enforce resource limits on pod labels, and thereby to account for pool and tenant quotas in terms of CPU and memory consumption. Corresponding new config options for Kubernetes pools have been added accordingly.
The existing tenant quota settings apply accordingly. Note that CPU and memory quotas still cannot be considered for labels that do not specify any limits, i.e. labels for which neither a pool default nor a label-specific limit is set.
Priority may now be set at the provider or provider-pool level. Providers or pools with the highest priority will fulfill requests first, until they are at quota, at which point lower-priority providers will begin to be used.
See providers.priority for details.
Due to an internal change in how Nodepool launchers communicate with each other, all launchers should be upgraded to the same version within a short period of time.
They will generally continue to work at different versions, but the mechanism that allows them to yield to specific providers when requested is being changed and so that will not function correctly unless they are upgraded near-simultaneously.
Fixes an exception that was raised by the OpenStack driver when attempting to reset the quota timestamp and ignore-provider-quota is true. This exception prevented nodepool from seeing quota errors from OpenStack, causing it to fail the node request instead of retrying as it does for other providers.
Removes diskimage.meta checks from the OpenStack driver. The limit of only 5 entries is anachronistic and now removed. Rather than trying to pre-guess what OpenStack wants, the metadata is now passed as-is and OpenStack will reject it at upload time.
Previously, metadata was checked by nodepool and invalid values would cause all metadata to be silently ignored. Now, metadata will be passed directly to glance, and an API error will occur. This may mean that images that previously uploaded (with no metadata) will now cause an API error when uploading.
The AWS driver has been updated to achieve parity with other Nodepool drivers.
The AWS driver now supports rate limiting. It utilizes a two-tier rate limiting system to match AWS’s request token buckets. The rate specified in the config file is used as the rate for mutating requests. Non-mutating requests will have their rate limited to 10 times that amount.
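The relationship between the two tiers can be sketched as follows. This only illustrates the arithmetic described above (configured rate for mutating requests, ten times that for non-mutating ones); the class and function names are hypothetical and the driver's actual limiter is not shown here.

```python
from dataclasses import dataclass

# From the note above: non-mutating requests get 10x the configured rate.
NON_MUTATING_MULTIPLIER = 10


@dataclass(frozen=True)
class TwoTierRates:
    mutating: float      # requests per second
    non_mutating: float  # requests per second


def rates_from_config(configured_rate):
    """Derive both tiers from the single rate in the config file."""
    return TwoTierRates(
        mutating=configured_rate,
        non_mutating=configured_rate * NON_MUTATING_MULTIPLIER,
    )


def min_interval(rates, mutating):
    """Minimum spacing in seconds between requests of the given kind."""
    rate = rates.mutating if mutating else rates.non_mutating
    return 1.0 / rate


rates = rates_from_config(2.0)
print(rates, min_interval(rates, mutating=True), min_interval(rates, mutating=False))
```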
The AWS driver now supports quota. AWS only provides a quota value for the number of cores.
The AWS driver now supports diskimage uploads.
The AWS driver uses a new state machine framework within Nodepool with significant caching in order to improve performance at scale.
The AWS driver now supports IPv6 addresses.
The openshift and openshiftpods drivers now support pods using images from private registries by configuring image-pull-secrets.
It is now possible to configure the zookeeper-timeout, which can help to avoid ZooKeeper session losses on busy systems.
The AWS driver will now ignore the “Name” tag if specified. Instead, it matches the behavior of other Nodepool drivers and sets the instance name to the Nodepool hostname (which is derived from the node name; e.g., “np0000000001”).
Python 3.8 or newer is now required. This change was made for two reasons: the IBM SDK requires Python 3.7 or newer, and Zuul now requires 3.8 due to Ansible 5. Nodepool has been updated to match.
In AWS providers, the public-ip-address setting is deprecated. Use
In AWS providers, specifying image filter values as non-string values is deprecated. The current behavior is that Nodepool coerces non-string values (such as true or integers) into strings, but a later version of Nodepool will produce an error. Please update config files to use literal (quoted if necessary) YAML strings.
Kubernetes 1.8 or newer is required by the Kubernetes driver. This was necessary to support Kubernetes 1.22.0 and newer, which require using APIs that are not supported before version 1.8.
This release is nearly identical to 4.4.0. The major version increment to 5.0 is to re-align with Zuul.
Added support for filtering Azure images. Use the providers.[azure].cloud-images.image-filter setting to specify a private image using filters (tags, for example).
The node-attributes setting has been added to the Azure driver.
The Azure driver now supports setting an admin password, which is required in order to launch Windows images on Azure.
The shell-type setting has been added to the Azure driver.
The Azure driver now supports user-data and custom-data.
Two new nodepool commands, nodepool export-image-data and nodepool import-image-data, have been added to back up the image data in ZooKeeper to a file in case the ZooKeeper cluster is lost.
Added the option to set quota on resources on a per-tenant basis (i.e. Zuul tenants).
A new top-level config structure tenant-resource-limits has been added under which one can specify a number of tenants, each with max-ram limits. These limits are valid globally, i.e., for all providers. This differs from currently existing provider and pool quotas, which only are considered for nodes of the same provider. This feature is optional and tenant quotas are ignored for any NodeRequests that do not deliver tenant information with them. Also no quota is evaluated for tenants that have no limits configured for them.
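The evaluation rules above (global limits, skipped when the request carries no tenant or the tenant has no limits) can be sketched as a single check. This is an illustrative sketch only: the function name, the data shapes, and the tenant name are hypothetical, and the resource key "ram" stands in for whatever resources are configured.

```python
def within_tenant_limits(tenant, requested, in_use, tenant_limits):
    """Decide whether a request fits within its tenant's global limits.

    tenant:        tenant name from the node request, or None
    requested:     resources the request needs, e.g. {"ram": 8192}
    in_use:        resources the tenant already uses across ALL providers
    tenant_limits: mapping of tenant name to a dict of resource limits
    """
    if tenant is None:
        return True  # no tenant information: tenant quota is ignored
    limits = tenant_limits.get(tenant)
    if not limits:
        return True  # tenant has no limits configured: nothing to check
    for resource, limit in limits.items():
        if in_use.get(resource, 0) + requested.get(resource, 0) > limit:
            return False
    return True


limits = {"example-tenant": {"ram": 16384}}  # hypothetical tenant
print(within_tenant_limits("example-tenant", {"ram": 8192}, {"ram": 12288}, limits))
```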
Fixes a regression in gathering public SSH host keys on slower nodes. We now wait until SSHd on the remote node is started before gathering host keys.
Fix Kubernetes in-cluster configuration loading if no local config is present. The previous release missed a fallback case which has been corrected.
Nodepool 0.3.6 introduced an unintended behaviour change with statsd reporting. Due to a change in the way nodepool manages OpenStack API calls, all API related statistics created during interaction with clouds are now generated by openstacksdk and prefixed with openstack.api instead of being created by nodepool and prefixed with nodepool.provider.<cloud> as in prior versions. If you wish to revert to the prior behaviour, changes have been provided to openstacksdk to allow setting custom prefixes via the cloud configuration file; see statsd documentation.
Multi-line log messages (such as tracebacks from image builds) are now prefixed in the same manner as single-line messages.
AWS EC2, GCE, Kubernetes, Openshift, Openshift Pods, OpenStack and Static drivers now support a shell-type config. Shell-type config is intended to enable setting of cmd or powershell as shell type for Windows workers with connection-type ssh.
For Linux workers, there is a long-standing Ansible issue with using a non-default shell type and become, so care should be taken if using it for such workers.
The zuul-public-key configuration attribute in the providers Azure driver has been moved and renamed. Please move this setting to its new location at providers.[azure].cloud-images.key.
TLS is now required for ZooKeeper connections. TLS support has been optional since version 3.13. If you have not already enabled it, we recommend enabling it before upgrading to 4.0.
Configuration values can be set from environment variables using the %(NODEPOOL_env_name) syntax.
Basic support has been added for specifying k8s/OpenShift nodeSelectors on Pod node labels. This allows scheduling a Pod on k8s nodes with specific labels, e.g., having certain capabilities.
Support for passing environment variables to k8s and OpenShift Pod build nodes has been added.
It is not possible to set persistent env vars in containers at run time because there is no login shell available. Thus, we need to pass in any env vars during node launch. This allows setting, e.g., http_proxy variables. Environment variables can be defined on node labels as a list of dictionaries with name and value fields as per the k8s container YAML schema.
The k8s and OpenShift providers no longer set the workingDir attribute of their container specs to /tmp.
For increased flexibility for the user, the working dir specified in the container image's Dockerfile is used as the default in container nodes. Please note that this might often be the root dir (‘/’) if not specified otherwise by the respective Dockerfile's WORKDIR directive.
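The "list of dictionaries with name and value fields" shape can be produced from a plain mapping as sketched below. The helper name and the proxy URL are hypothetical; only the name/value structure comes from the text above.

```python
def env_list(variables):
    """Convert a mapping into a list of {"name": ..., "value": ...} dicts.

    Values are converted to strings, since environment variable values
    in a container spec are strings.
    """
    return [{"name": name, "value": str(value)} for name, value in variables.items()]


# Hypothetical proxy settings, for illustration only.
print(env_list({"http_proxy": "http://proxy.example.com:3128", "RETRIES": 3}))
```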
The dependency on kazoo has been upgraded to 2.8.0 which has an important fix for using Zookeeper over TLS.
The docker images published to zuul/nodepool-builder are now built as multi-arch images and support arm64 in addition to amd64.
Entries in the diskimages section can now specify a parent image to inherit configuration values from. You can also specify images for configuration use only as abstract to consolidate common values.
-1 will disable the removal of old build logs.
Support for encrypted connections to ZooKeeper has been added.
Before enabling, ensure that both Zuul and Nodepool software versions support encrypted connections. See the Zuul release notes, documentation, and associated helper scripts for more information.
Both Zuul and Nodepool may need to be restarted together with the new configuration.
nodepool.provider.<provider>.downPorts has been renamed to
Zookeeper hosts specified as IPv6 literals will now be configured correctly.
Support for resources in Google Compute Engine (GCE) has been added.
Add optional ebs-optimized on ec2 instances.
Add optional tags on ec2 instances and use cloud-image label as Name.
It is now possible to specify if AWS nodes shall get a providers.[aws].pools.public-ip-address.
The AWS driver now supports custom providers.[aws].pools.labels.userdata when launching instances.
There is a new GET /ready endpoint that can be used as a readiness probe.
Nodepool now supports providers.[openstack].post-upload-hook to run a user supplied script after an image has been uploaded to a cloud but before it gets used.
Fixed compatibility issue with openstacksdk 0.37.0 and above.
Fixed a Kubernetes driver service account creation issue that caused Zuul jobs to fail with: MODULE FAILURE: error: You must be logged in to the server (Unauthorized)
diskimage can specify the full path to the diskimage-builder command with the dib-cmd configuration parameter. The nodepool-builder (only used by CI) has been removed and replaced with explicit calls in testing fixtures.
/usr/bin/python2). With this, Zuul 3.11.1 and greater will set the auto when using Ansible >=2.8 to use automated interpreter discovery. When using earlier Ansible, it will remain the old default of
This will remove the need to override python-path explicitly for Python 3-only distributions, which should be detected correctly automatically.
This release should only be run against Zuul 3.11.1 or greater. Earlier Zuul releases will not convert the new default /usr/bin/python2 for Ansible <2.8, leading to a configuration error. It may be possible to use earlier Zuul releases if you are only using Ansible >= 2.8, or explicitly set python-path for every image.
The Kubernetes driver now supports optionally loading cluster admin service account information from the standard in-cluster configuration paths if Nodepool itself is running in Kubernetes. If this method is used, installation of a kube/config file in the Nodepool launcher pod is no longer required.
Fix a dependency issue with the openshift python client that would prevent nodepool-launcher from starting properly.
A new driver is available to support an unprivileged OpenShift cluster as a resource provider, enabling pod creation within a developer project.
Provider labels for the OpenStack driver are now able to toggle providers.[openstack].pools.labels.host-key-checking. This overrides the host-key-checking value defined by providers.[openstack].pools.host-key-checking.
Provider labels for the OpenStack driver are now able to select which networks to be attached to. This overrides any networks defined by providers.[openstack].pools.networks.
The diskimage-builder stats have been reworked to be more useful. The return code and duration are now stored in nodepool.dib_image_build.<diskimage_name>.status.<rc|duration>; previously this was split for each image format. This is unnecessary and confusing since the results will always be the same, since all formats are generated from the same diskimage-builder run. An additional gauge nodepool.dib_image_build.<diskimage_name>.status.last_build is added to make it easy to show relative time of builds in dashboards.
The TaskManager used by the OpenStack provider has been removed. The openstacksdk has grown support for rate limiting using a FairSemaphore instead of a pool of worker threads. This should reduce the overall thread count.
statsd key names have changed. Because of the removal of TaskManager, statsd calls are being deferred to openstacksdk. Instead of keys of the form ComputeGetServers, the openstacksdk keys are of the form compute.GET.servers. They will always start with the normalized service-type, followed by the HTTP verb, followed by a . separated list of url segments. Any service version, project-id entries in the url or .json suffixes will be removed.
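The key format described above can be approximated as follows, which may help when rewriting dashboards or alert rules. This is a rough sketch and not openstacksdk's implementation: the function name is hypothetical, the service type is assumed to be already normalized, and the patterns used to recognise version and project-id segments are assumptions.

```python
import re

# Assumed patterns: version segments like "v2" or "v2.1", and project
# ids as 32 lowercase hex characters.
VERSION_RE = re.compile(r"^v\d+(\.\d+)*$")
PROJECT_ID_RE = re.compile(r"^[0-9a-f]{32}$")


def statsd_key(service_type, method, url_path):
    """Build a key like 'compute.GET.servers' from a request.

    service_type is assumed to be normalized already.
    """
    if url_path.endswith(".json"):
        url_path = url_path[: -len(".json")]
    segments = []
    for segment in url_path.split("/"):
        if not segment:
            continue  # skip empty parts from leading or doubled slashes
        if VERSION_RE.match(segment) or PROJECT_ID_RE.match(segment):
            continue  # drop service version and project-id entries
        segments.append(segment)
    return ".".join([service_type, method.upper()] + segments)


print(statsd_key("compute", "get", "/v2.1/servers"))
```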
The new Amazon Web Services (AWS) EC2 Driver allows launching EC2 instances as nodes.
A new option (build-timeout) has been added to the builder diskimage configuration to control how long the builder should wait for image builds before giving up. The default is 8 hours.
A new driver is available to support an OpenShift cluster as a resource provider, enabling project and pod requests.
The AWS driver does not support quota management at this time.
The AWS driver does not support custom image building.
A new configuration option is available under the ‘pools’ attribute of an OpenStack provider. This config value, ‘node-attributes’, can contain a dictionary of arbitrary key-value pairs and will be stored with the node data within ZooKeeper.
A change to the ZooKeeper schema to support a new DELETED node state will require a total shutdown of all Nodepool launchers before restarting any of them with this version.
Fixes a regression of missing task statistics with OpenstackSDK versions greater than 0.19.0.
Added a new routine to the OpenStack driver cleanup resources phase that will remove any ports reported to be in the DOWN state. Ports will have to be seen as DOWN for at least three minutes before they will be removed. The number of ports removed will be reported to statsd.
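The three-minute grace period can be sketched as a small bookkeeping routine that remembers when each port was first seen DOWN. This is an illustration of the rule above, not the driver's code; the function name, the argument shapes, and the use of plain numeric timestamps are assumptions.

```python
DOWN_PORT_GRACE_SECONDS = 3 * 60  # ports must be DOWN at least this long


def ports_to_remove(down_ports_now, first_seen_down, now):
    """Return ids of ports that have been DOWN long enough to remove.

    down_ports_now:  ids of ports currently reported as DOWN
    first_seen_down: dict of port id -> time first seen DOWN; updated
                     in place across cleanup runs
    now:             current time in seconds
    """
    current = set(down_ports_now)
    # Forget ports that are no longer DOWN so their timer restarts.
    for port_id in list(first_seen_down):
        if port_id not in current:
            del first_seen_down[port_id]
    removable = []
    for port_id in sorted(current):
        seen = first_seen_down.setdefault(port_id, now)
        if now - seen >= DOWN_PORT_GRACE_SECONDS:
            removable.append(port_id)
    return removable


seen = {}
print(ports_to_remove(["port-a"], seen, now=0))
print(ports_to_remove(["port-a"], seen, now=180))
```

The length of the returned list corresponds to the count that would be reported to statsd.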
The nodes by label and state statistic gauges are now correctly reset to zero if no node of a label and state exists.
Task names are now consistently normalised to CamelCase without delimiters. Some statistics sent to statsd with _ characters will have changed keys, for example
Two new metrics are now reported after each run of the diskimage builder: nodepool.builder.dib_image_build.<diskimage_name>.<ext>.rc will be set to the last result code of the diskimage builder. This metric can be used to set up alerting for failed disk image builds. nodepool.builder.dib_image_build.<diskimage_name>.<ext>.duration will receive the time it took to build the disk image.
The OpenStack driver now supports configuring instance properties on boot. These properties end up in the instance metadata and will be visible to the instance after boot. Use the instance-properties dict on provider pool label to set this per label type booted.
The static driver now updates labels and connection related attributes in Zookeeper at startup and on config change. Changing the name of a node will be handled via the registration/deregistration flow as before.
Bump minimum version of openstacksdk library to 0.17.2 to correct an issue causing a crash in OpenStack provider communication threads.
A new boolean pool variable ignore-provider-quota has been added to allow the provider quota to be ignored for a pool. Instead, nodepool only checks against the configured max values for the pool and the current usage based on stored data. This may be useful in circumstances where the provider is incorrectly calculating quota.
The detailed nodepool list outputs the node’s pool.
Diskimages env-vars can be set in the secure.conf file.
A new node status (ABORTED) is added to the ZooKeeper data model. It is recommended that, during your nodepool upgrade, you shut down all launcher processes before restarting any of them. Running multiple launchers with mixed support of this new node status may cause unexpected errors to be reported in the logs.
For pre-existing cloud images (not managed by nodepool), referencing them by ID was failing since they could not be found with this data, only by name.
Nodepool now defaults to building qcow2 diskimages instead of failing if the diskimage doesn’t specify an image format and the diskimage isn’t used by any provider. This makes it more convenient to build images without uploading them to a cloud provider.
Added support for specifying security-groups for the nodes in the OpenStack driver. Pool.security-groups takes a list of SGs to attach to the server.
The static driver now pre-registers its nodes with ZooKeeper at startup and on configuration changes. A single node may be registered multiple times, based on the value of max-parallel-jobs.
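The effect of max-parallel-jobs on registration can be sketched as follows. The function name, the node dictionaries, the hostname, and the default of one registration when the value is absent are assumptions made for illustration; this is not the driver's code.

```python
def registrations_for(static_nodes):
    """List one registration entry per allowed parallel job.

    Each node is a dict with a "name" and an optional
    "max-parallel-jobs" (assumed here to default to 1).
    """
    registrations = []
    for node in static_nodes:
        count = node.get("max-parallel-jobs", 1)
        registrations.extend([node["name"]] * count)
    return registrations


# Hypothetical static nodes.
nodes = [
    {"name": "builder1.example.com", "max-parallel-jobs": 3},
    {"name": "builder2.example.com"},
]
print(registrations_for(nodes))
```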
Nodepool can now support multiple node labels, although the OpenStack and static node drivers do not yet support specifying multiple labels, so this is not yet a user-visible change. This does, however, require shutting down all launcher processes before restarting them. Running multiple launchers with mixed support of multi-label will cause errors, so a full shutdown is required.
Fixed a bug where if a request handler is paused and an exception is thrown within the handler, the handler was not properly unpaused and the request remained in the list of active handlers.
The connection port can now be configured in the provider diskimages section.
Added support for configuring Windows static nodes. A static node can now define a
ssh-port option has been renamed to
ssh-port in static node config is deprecated. Please update config to use