
Allow Service NodePorts on low ports #9995

Closed
thockin opened this issue Jun 18, 2015 · 47 comments
Labels
kind/design Categorizes issue or PR as related to design. priority/backlog Higher priority than priority/awaiting-more-evidence. sig/network Categorizes an issue or PR as relevant to SIG Network.

Comments

@thockin
Member

thockin commented Jun 18, 2015

To capture some of the discussion.

Gripe: Service NodePorts can only be in a "special" port range (apiserver flag). This is annoying for people who want to expose a service on port 80 (for example) but instead have to use something like 30986.

Rationale for this behavior:

  1. We don't want service node ports to tromp on real ports used by the node (e.g. 22).
  2. We don't want service node ports to tromp on pod host ports.
  3. We don't want to randomly allocate someone port 80 or 443 or 22.

Proposed compromise: Allow NodePorts on low ports with a bunch of caveats.

To address rationale points (1) and (2): Make kube-proxy open (and hold open) ports that it is using. This will prevent it from using port 22 (for example) for a service node port.

To address rationale point (3): Use the flag-configured range for random allocations, but allow users to request any port they want.

Caveats and analysis:

  1. Error reporting is not good. We do not know what non-container stuff is using ports on the host, and we do not have an easy way for the API server to coordinate port allocation between pods and services. Not easy to resolve. The implication is that port-in-use errors will only be detected by kube-proxy when it tries to open the port on the node. Sending events for this is possible, but not great (one event from every node), and no other node-level error gets a kube-proxy event. Net result: the user asked for a service on node port 22, it just doesn't work, and they have no idea why.

  2. Doing dual ranges (use the flag range for allocations but allow the whole range to be requested) is non-trivial and has to be plumbed through more code than I am comfortable with at this point. The implication of not doing this is that sometimes people will be allocated a port that happens to be 22 and can never work. Combined with caveat (1) this is really unpleasant. We could do some REALLY hacky things, like just retrying the random allocation if it is not in the flagged range. This avoids plumbing the dual-range logic down, but is embarrassingly ugly. I am ashamed to have suggested it.

  3. Holding ports open is pretty easy and we should do this anyway. The error reporting properties are still not good.

Summary: I am unconvinced that this is worthwhile to rush into v1. We still have the option of baking a container that receives traffic on a HostPort and redirects it to a service. I have a demo of using socat to do this (simulating a pod manually):

root@kubernetes-minion-nnx6:/home/thockin# docker run -d --dns=10.0.0.10 --dns-search=default.svc.cluster.local -p 8080:93 thockin/hostport-to-service 93 hostnames2:80
1dcc1e94c30834290ae243ac298c6699b2a3348fc014b4b77ae34c13ead44854

root@kubernetes-minion-nnx6:/home/thockin# curl localhost:8080
hostnames2-gxt4f
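For reference, the same hostport-to-service trick can be sketched as a pod manifest instead of a raw docker run. This is only a sketch: the image, ports, and service name are taken from the demo above, and hostPort behavior depends on your network plugin.

```yaml
# Sketch of the socat demo above as a pod (illustrative).
apiVersion: v1
kind: Pod
metadata:
  name: hostport-to-service
spec:
  containers:
  - name: forwarder
    image: thockin/hostport-to-service
    args: ["93", "hostnames2:80"]   # listen on 93, forward to the Service
    ports:
    - containerPort: 93
      hostPort: 8080                # node port 8080 -> container port 93
```

With this pod running, `curl <node-ip>:8080` should reach the `hostnames2` Service, just as in the docker demo.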

@brendandburns @justinsb @eparis

@saad-ali saad-ali added kind/design Categorizes issue or PR as related to design. priority/backlog Higher priority than priority/awaiting-more-evidence. team/cluster labels Jun 18, 2015
@thockin
Member Author

thockin commented Jun 18, 2015

Priority to be decided - some argue that it is P0.

@ghost

ghost commented Jun 18, 2015

Thanks for summarizing the discussion so succinctly @thockin.

Unless I'm misunderstanding the proposal, it involves unpredictable and difficult to diagnose failure of services, which seems like a complete non-starter to me. Or at the very least, vastly more undesirable than the annoyance of not being able to expose a service on a favorite pet host port number?

Am I missing something?

@thockin
Member Author

thockin commented Jun 18, 2015

I hope my summary wasn't TOO slanted. There are legitimate issues that this sort of change would fix.

@ghost

ghost commented Jun 18, 2015

Don't worry, I mentally filtered out your propaganda :-)

But do the legitimate issues (which don't seem to be described in any detail) outweigh the "unpredictable and difficult to diagnose failures"? It's hard to believe that they would, which is why I ask whether I'm missing something.

@ghost

ghost commented Jun 18, 2015

PS: I am familiar with the "I have an upstream application/network device which connects/forwards to port x and it's difficult to reconfigure" use case. Is that the legitimate issue? Are there others?

@thockin
Member Author

thockin commented Jun 18, 2015

That is one of the legitimate use cases, though no user is claiming this one is a hard requirement. The biggest issue is simply friction and "kick the tires" operation.

@ghost

ghost commented Jun 18, 2015

In that case I'm going to argue (quite strongly :-)) that allowing service nodeports on low port numbers should be optional, and turned off by default.

@eparis
Contributor

eparis commented Jun 18, 2015

A great example would be the L7 ingress router used by openshift. It needs to run on 80 and 443. Quite reasonable and not at all a 'kick-the-tires' kind of thing.

While hostPort might be possible it does mean that I can't use rc's any more, as I have to start pinning the ingress router container to specific nodes. With publicIPs this was solved, as externally people could be told to use publicIP[0]:443 and everything would be ok as the ingress router could still be managed by the cluster.

(actually for some cert checking reasons we've had to bastardize the ingress router, but it worked really well on 80)

@ghost

ghost commented Jun 18, 2015

@eparis Yes, your use case is the "PS" one I referred to above. It's not clear to me how you get around points 1 and 3 in the caveats and analysis section of the original description above (a.k.a. "user asked for a service on node port x, and it sometimes just doesn't work and they have no idea why")?

@thockin
Member Author

thockin commented Jun 19, 2015

@eparis I don't think I understand your model. You run L7-as-a-service and you tell people to connect to $node_ip on :80 and :443, rather than a load balancer? And you're confident that no user of the cluster will ever ask for a Service with port 80 or 443 and ask for NodePort of the same port?

If so, you could change the apiserver's --service-node-port-range to "1-32767" and install your L7 before anyone "randomly" claims :80 or :443.
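The operator-side change mentioned here would look roughly like this in a kube-apiserver static-pod manifest. This is a fragment only; the file path and the other flags are illustrative and omitted.

```yaml
# /etc/kubernetes/manifests/kube-apiserver.yaml (fragment, illustrative)
spec:
  containers:
  - name: kube-apiserver
    command:
    - kube-apiserver
    - --service-node-port-range=1-32767   # widen the allocatable NodePort range
    # ... other flags unchanged ...
```

As the surrounding discussion notes, widening the range this way also exposes you to random allocation of low ports.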

To reiterate, we WANT to enable this, but there's not time to do it properly before 1.0 - if you're comfortable with the caveats above, you can fix this today. My argument is that it should not default to "ridiculously brittle" mode and that the changes to make it not-brittle are too much churn for this point in the release cycle.

@eparis
Contributor

eparis commented Jun 19, 2015

--service-node-port-range=1-32767 suffers badly from the risk of accidentally stomping on port 22: it's not that someone screwed up, the SYSTEM screwed up....

@eparis
Contributor

eparis commented Jun 19, 2015

@thockin some people do have an external load balancer to terminate ssl and load balance to pods, which operates with access to the overlay network. Those people have better scalability for sure. But we provide a simple 'load balancer', running on ${some_node_ips} :80 :443 (nodePort would mean $all_node_ips).

We have no controller for nodePort yet, but yes, the plan is to outright reject a service if a 'normal' user specified anything other than 0.

To be honest, it does make me wonder what we do today if someone asks for hostPort=22. I think we just outright reject hostPort, but I don't know for sure...

@thockin
Member Author

thockin commented Jun 19, 2015

If someone asks for HostPort=22 today it will fail to run on kubelet. You might pre-filter, but we don't.

As for stomping, it also will not work - we won't be able to claim the port. It's a bad situation wrt errors, but there's no way we will fix that in the next few weeks - I think the risk is too high for that change. I'll have a think if there's a less risky way. We could pre-allocate 22 I guess, but that's pretty pukey and specific.

@statik

statik commented Jun 23, 2015

I would like to make the case that this is an important use case for a real customer.

I am trying to deploy a custom DNS service inside GKE. The DNS service gathers security metrics on the requests, so I really do want to be running this myself. So, I need to get TCP and UDP traffic on port 53 over to my pod.

The first thing I ran into was #2849 where the external load balancer does not support UDP.

Then I discovered type: NodePort, and am trying to figure out how to use this while manually configuring a GCE Network Load Balancer to send traffic into the GKE cluster. As far as I can tell, there is no way in GCE to create forwarding rules that change the port. So, I think the only option left is for me to specify a NodePort of 53 for my custom DNS service.
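Under the proposal being discussed (and with a suitably widened --service-node-port-range), the DNS use case would look something like the sketch below. The name and selector are illustrative, not from the original post.

```yaml
# Sketch: a NodePort Service pinning TCP and UDP port 53 (illustrative names).
apiVersion: v1
kind: Service
metadata:
  name: custom-dns          # illustrative name
spec:
  type: NodePort
  selector:
    app: custom-dns         # assumed pod label
  ports:
  - name: dns-udp
    protocol: UDP
    port: 53
    nodePort: 53            # only accepted if 53 is inside the configured range
  - name: dns-tcp
    protocol: TCP
    port: 53
    nodePort: 53
```

The external load balancer would then forward port 53 traffic to any node unchanged.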

@justinsb
Member

@thockin is there any real reason we don't support UDP in the load balancer? We've changed the method signature in the cloudprovider so we pass ports as []*api.ServicePort now (instead of []int); and it looks like we're actually listening on the UDP port in proxy anyway. @statik's use case seems real, but driven by a lack of UDP load balancing.

@statik

statik commented Jun 25, 2015

For sure I would prefer #2849. Also, I think being able to configure a GCE Network Load Balancer to send traffic to a higher port would be totally fine. After thinking about this overnight, I think either or both of those things are the actual problem, and low ports in NodePorts would not be needed by me if I could specify the destination port in the manually configured Google Network Load Balancer (as I can with an Amazon ELB).

@thockin
Member Author

thockin commented Jun 26, 2015

The "rightest" answer is to add UDP support to load-balancer, but not just before 1.0

Unfortunately GCE's load-balancer is a forwarder, and does not allow port-remapping.

Here's a hack that can maybe tide you over: #10405 - please try it and let me know if it doesn't work.

@gmile
Contributor

gmile commented May 22, 2018

Has anything changed regarding the ability to bind a service to a privileged port?

Right now we have to work around this, weirdly, by running an nginx alongside kubelet to forward from port 80 to 32167 (32167 is assigned randomly by the NodePort Service).

If only kubernetes allowed mapping NodePort Service onto a particular port on Node, this would save us one indirection.

@thejmazz

thejmazz commented Jul 7, 2018

@gmile have you tried the socat -> service (proxy-to-service) approach? I.e. swap your NodePort for a ClusterIP and have a DaemonSet running socat instances that proxy to it (you may need to enable the hostPort plugin for your CNI), or plain Pods. If you use a DS it will behave similarly to binding ports in Swarm's overlay.

@cjbottaro

I have a web server with a static IP address, which is whitelisted by our clients. Because of this, I don't think I can use an Ingress resource, because I'm running in AWS and they spin up ELB/ALBs with random and changing IP addresses.

So what I'm doing is using taints and labels to make my web server pod land on the node with the static IP address, but I can't get it to listen on the node's port 80.

Any suggestions? Thanks.

@thockin
Member Author

thockin commented Aug 13, 2018 via email

@cjbottaro

@thockin That didn't work until I installed this:
aws/amazon-vpc-cni-k8s#153

Apparently hostPort didn't work in CNI's until recently and amazon-vpc-cni was (is) lagging behind on that functionality.

@Nuxij

Nuxij commented Jan 19, 2019

Usecase: IRCd (6667) [somewhere] on [the cluster].

  • Why can't I just create an Ingress resource for that port? Oh they only do HTTP for some reason.
  • I refuse to use a LoadBalancer type anything, as it costs extra money (requires cloud provider LB like AWS ELB).
  • Maybe a NodePort but I can only bind some whack high ports.
  • HostPort? DaemonSet? No scaling or all the scaling.

Just let me bind any port except the "system reserved" ones, is that strange? I don't understand why Kubernetes blocks us on this.

WTF Kubernetes.

@ghost

ghost commented Jan 23, 2019

it would be really nice to use 80 or 443 in nodeport

@tomekit

tomekit commented Jan 29, 2019

I am with @joeasaurus.
I would really like to avoid using the LoadBalancer type, as it adds extra complexity when managing long-running HTTP connections. I've got legacy APIs which need an HTTP connection maintained for up to 12 hours.
That's pretty easy to achieve with a stable network and direct communication, without a LoadBalancer.
E.g. AWS caps the idle timeout for their load balancer at 4000 seconds: https://docs.aws.amazon.com/elasticloadbalancing/latest/application/application-load-balancers.html#connection-idle-timeout
Could not find the exact official value for the Google LB, but it looks like it's 10 minutes:
https://stackoverflow.com/questions/39603274/how-long-will-google-cloud-https-load-balancers-keep-tcp-connections-open

@wzrdtales

What about SSH and other low ports? That should be supported. Is this configurable by now?

@desmap

desmap commented Mar 4, 2019

Just let me bind any port except the "system reserved" ones, is that strange? I don't understand why Kubernetes blocks us on this.

@quinton-hoole, @thockin, @eparis is there an update on this? @joeasaurus made quite good points. Services need to be able to expose all ports except 22 without any workarounds which don't autoscale.

Since this issue is also quite old, I kindly ask that it be reopened so we can reach a decision together, for or against opening all ports (except 22). Leaving this on P0/backlog, and hence making no decision, is a waste of time for all participants.

@llarsson

It seems to me that this Istio Ingress Gateway issue should be solvable out of the box if its NodePort Service could just bind to ports 80 and 443, for cases when you cannot use a cloud load balancer (such as on-prem clusters). Because then I should be able to point all my relevant DNS entries to my nodes and thus browsers would send the Host headers that Istio's Ingress Gateway Service expects.

So with that said, I too would like to have this functionality. I know that one can use the host's network to achieve this effect, but a cleaner way of doing this for e.g. Kubernetes Ingress Controllers or, for that matter, Istio's Ingress Gateway would be preferable.

@bitsofinfo

need this!

@minhtrietbkit

We have a very specific case for using NodePort instead of LoadBalancer. Several endpoints in our application require a really long connection timeout (one of them being two hours). We are using AWS EKS (AWS-native K8s) and the load balancers there don't allow a timeout that long. This is the main reason why we have to use NodePort. If we can't expose a NodePort service on port 80/443, it is a big hindrance to our deployment, since users can't access the service through browsers.

@llarsson

@minhtrietbkit (and others with this issue) until this bug is fixed, which I hope will happen but am not holding my breath, you should be able to work around this limitation by deploying such services using a DaemonSet with privileged Pods that can use the host network. If you wish to use only a subset of your nodes for a particular service, you could taint the nodes and use the taints/tolerations mechanism to control scheduling. To communicate with the DaemonSet from within the cluster in a clean way, you can use any of the documented alternatives.

Admittedly, this is a rather hacky workaround that ought to be documented with a reference to this issue, but it would at least let you overcome the hindrance in the short term.

Since all NodePort services have to open the same port on all nodes, the DaemonSet approach I just outlined would actually be cleaner in some sense than actual NodePort support, because at least with the DaemonSet I can partition my set of nodes so more than one service can own port 80/443, rather than one service owning it on every node forever. Unless that one service is an Ingress Controller or similar, that would probably not be desirable in the general case.
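The DaemonSet workaround described above might be sketched like this. Assumptions not in the original comment: an illustrative taint key `edge` on the chosen nodes, and nginx standing in for any server that binds :80/:443 itself.

```yaml
# Sketch: host-network DaemonSet on a tainted subset of nodes (illustrative).
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: edge-proxy                  # illustrative name
spec:
  selector:
    matchLabels: {app: edge-proxy}
  template:
    metadata:
      labels: {app: edge-proxy}
    spec:
      hostNetwork: true             # pod binds the node's :80/:443 directly
      tolerations:
      - key: edge                   # assumed taint on the designated nodes
        operator: Exists
      containers:
      - name: proxy
        image: nginx                # stand-in for your 80/443 listener
```

Pointing DNS at the tainted nodes then gives low-port access without any NodePort involvement.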

@matti

matti commented May 18, 2019

Many people in this thread would benefit from https://github.com/kontena/akrobateo/blob/master/README.md

@evgnomon

evgnomon commented Apr 19, 2020

You might want to do this to expose low range to public,

apiVersion: v1
kind: Service
metadata:
  name: forwarder
  namespace: istio-system
spec:
  selector:
    app: istio-ingressgateway
    istio: ingressgateway
  sessionAffinity: None
  type: NodePort
  ports:
    - name: http
      protocol: TCP
      port: 80
    - name: https
      protocol: TCP
      port: 443
  externalIPs:
    - 88.198.XX.XX
    - 95.216.YY.YY

@rektide

rektide commented Sep 26, 2020

Unless I'm misunderstanding the proposal, it involves unpredictable and difficult to diagnose failure of services, which seems like a complete non-starter to me.

This seems to be the only resistance I've heard to this, and I don't understand the complaint. It wasn't elaborated or explained.

This issue seems like a straightforward way to easily host a range of services (like HTTP) that are currently a huge pain to deal with, while maintaining current functionality for most. This should be re-opened. I don't understand why hosting HTTP or DNS is so very, very difficult for small operators right now.

@mamiu

mamiu commented Oct 3, 2020

@liggitt @wojtek-t @dims @justaugustus (Sorry for mentioning you all, I just randomly picked still active contributors.)

  1. Who is able to reopen this issue?
  2. If this is not intended, who is able to give us a solid explanation why this won't be changed/"fixed"? (As requested in the comment above)

If you're not responsible for this part of k8s, but know who is, please include that person in this conversation.
Thanks.

@liggitt
Member

liggitt commented Oct 3, 2020

@thockin can reopen it if desired and has good context on why it would be a good or bad idea to do so. Adding this topic to the next sig-network agenda would be a good way to discuss it.

@mamiu

mamiu commented Oct 3, 2020

@liggitt Thanks for your quick response!

Can you include it in the sig-network agenda (or, if I'm able to do that myself, how can I add this topic to it)?

@liggitt
Member

liggitt commented Oct 3, 2020

The time and agenda info is at https://github.com/kubernetes/community/tree/master/sig-network

Joining the kubernetes-dev or kubernetes-sig-network mailing lists should get you access to those.

@mamiu

mamiu commented Oct 3, 2020

Thanks, I'll do that.

Edit:

It's now on the agenda for the next SIG Network Meeting. I've also sent an email to the mailing list.

@denizdogan

Was it ever discussed by the SIG? I'm assuming no, based on the note from October 15 that says:

(deferred - suggesting party not here) Suggested topic by Manu Miu (manu.miu@gmail.com): Discussion of the Github issue “Allow Service NodePorts on low ports” as described in this mailing list email.

@fuomag9

fuomag9 commented Mar 2, 2021

Is there any update on this? I find it very illogical not to be able to open ports different from port 22!

@theAkito

theAkito commented Mar 2, 2021

Is there any update on this? I find it very illogical not to be able to open ports different from port 22!

I think it is important to stay fair to all parties in a controversial discussion, which is why I would like you to clarify what you are talking about. This has nothing to do with port 22 (the default SSH port); the whole point of the special NodePort range is that ports such as the one for SSH do not get allocated.

All that said, it is best practice to change the default SSH port to something other than 22, anyway.

@fuomag9

fuomag9 commented Mar 2, 2021

Is there any update on this? I find it very illogical not to be able to open ports different from port 22!

I think it is important to stay fair to all parties in a controversial discussion, which is why I would like you to clarify what you are talking about. This has nothing to do with port 22 (the default SSH port); the whole point of the special NodePort range is that ports such as the one for SSH do not get allocated.

All that said, it is best practice to change the default SSH port to something other than 22, anyway.

I probably worded it badly; I meant it in the sense of "don't conflict with existing stuff" while still allowing lower-numbered ports in the range that NodePort can allocate.

@BenTheElder BenTheElder added the sig/network Categorizes an issue or PR as relevant to SIG Network. label Mar 2, 2021
@BenTheElder
Member

I just want to point out that this is actually up to your cluster operator; the node port range has long been configurable with the --service-node-port-range option for kube-apiserver. https://kubernetes.io/docs/reference/command-line-tools-reference/kube-apiserver/

Also this old comment:

If only kubernetes allowed mapping NodePort Service onto a particular port on Node, this would save us one indirection.

Unless I've completely misunderstood the intent here ... You can absolutely do that. It still has to be within the configured range, though. Set the nodePort field on the ports entry.
https://kubernetes.io/docs/concepts/services-networking/service/#nodeport
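Concretely, the nodePort field is set per ports entry, something like the sketch below. The name, selector, target port, and the value 30080 (an arbitrary example inside the default 30000-32767 range) are illustrative.

```yaml
# Sketch: pinning a specific NodePort within the configured range.
apiVersion: v1
kind: Service
metadata:
  name: web                 # illustrative name
spec:
  type: NodePort
  selector:
    app: web                # assumed pod label
  ports:
  - port: 80                # ClusterIP port
    targetPort: 8080        # assumed container port
    nodePort: 30080         # pinned node port; must be inside the configured range
```

If the requested nodePort is outside the range (or already taken), the apiserver rejects the Service at creation time.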


SIG Network still owns this. Any community member has the ability to reopen, but personally I would still defer to the network special interest group. I've sent a poke to them.

@aojea
Member

aojea commented Mar 3, 2021

This was answered in the mailing list too
https://groups.google.com/g/kubernetes-sig-network/c/Ihq_1AslEMQ/m/oFiOxOJcAQAJ

I think that this will be reopened as a bug if --service-node-port-range cannot select a different nodeport range or, if it can, it doesn't work.

@wzrdtales

@BenTheElder (also wondering why this pops up in my mails again and again) your interpretation is right; that is what this issue was about. Changing the node port range is supported. I guess people that hit this problem find this issue on Google before they find the documentation.

@thockin
Member Author

thockin commented Mar 5, 2021

If you set the service-node-port-range to include 22, then that means 22 might be randomly allocated to any Service, even if you have SSH running and want to keep it running. BAD.

One option discussed a long time back was to allow low ports to be EXPLICITLY requested but not allocated. But then what governs who may request low ports? And how do you block off the ports that you don't want anyone to use?

TL;DR: this seems like a lot of work and design for something that isn't really that common. You can use a pod hostPort to claim a low port, but only on a single node. That's better, but has many of the same problems - governance, collisions, exclusions.

NodePorts were designed to build LBs on top of. They are not supposed to be very friendly to humans, because humans are not the target audience.
