
Supporting multiple API prefixes #10009

Closed
uluyol opened this issue Jun 18, 2015 · 65 comments
Assignees
Labels
area/api Indicates an issue on api area. area/extensibility priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. sig/api-machinery Categorizes an issue or PR as relevant to SIG API Machinery.
Milestone

Comments

@uluyol
Contributor

uluyol commented Jun 18, 2015

Introduction

Continued from #7111 and #2306 (also relevant: #3806, #6363).

There is a strong desire for kubernetes to support multiple API prefixes. This would allow us to split the API into a more logical structure and introduce support for an experimental API prefix. Additionally, we would like to separately version different parts of the API and provide sensible scoping rules for API objects.

OpenShift serves its own objects separately from kubernetes objects using the /osapi endpoint. However, it registers its objects under the kubernetes api.Scheme (instead of a separate runtime.Scheme) and is forced to follow the kubernetes version numbers. Moreover, OpenShift's "kinds" must be unique to avoid clashing with those in kubernetes. We would like a more general approach to avoid these limitations.

Proposed Hierarchy

To achieve this, the API could have the following hierarchy: provider/apigroup/version/kind, with the following properties:

  • Any version/kind should be convertible to the corresponding version/kind within the same API group
  • API groups within the same provider should be able to refer to objects within other groups. This is needed for things like namespaces and lists, as well as experimental API groups.

In the existing API server, a runtime.Scheme somewhat approximates a provider/apigroup, while an apiserver.APIGroupVersion corresponds to a version.
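As a concrete sketch of that hierarchy (the type and field names below are hypothetical illustrations, not actual Kubernetes API machinery):

```go
package main

import "fmt"

// KindIdentifier locates an object in the proposed hierarchy.
// The struct and its fields are illustrative only.
type KindIdentifier struct {
	Provider string // e.g. "kubernetes"
	Group    string // e.g. "experimental"
	Version  string // e.g. "v1alpha1"
	Kind     string // e.g. "Deployment"
}

// Path renders the identifier as a slash-separated API path segment.
func (k KindIdentifier) Path() string {
	return fmt.Sprintf("%s/%s/%s/%s", k.Provider, k.Group, k.Version, k.Kind)
}

func main() {
	id := KindIdentifier{"kubernetes", "experimental", "v1alpha1", "Deployment"}
	fmt.Println(id.Path()) // kubernetes/experimental/v1alpha1/Deployment
}
```

Under this model, conversion is permitted between versions within one (provider, group), while cross-group references name the full identifier.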

Challenges

@bgrant0607 bgrant0607 added this to the v1.0-post milestone Jun 19, 2015
@bgrant0607 bgrant0607 added priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. area/api Indicates an issue on api area. area/extensibility sig/api-machinery Categorizes an issue or PR as relevant to SIG API Machinery. kind/enhancement labels Jun 19, 2015
@bgrant0607
Member

The plugin issues are also somewhat relevant, so I'll ref them here: #991, #3201

@bgrant0607
Member

Pointing out a minor detail: If we add more fields to the API schema, we'll want to populate them automatically so that the user doesn't have to, in the server (#3000) and/or client.

We probably could punt on provider for now, since there would only be different values for systems that didn't use Kubernetes conventions/tooling, such as Docker, cloud providers, etc.

@bgrant0607
Member

Some hypothetical API groups, to test proposals against:

  • end-user node-related objects: pod, pod template, secret, persistent volume claim
  • end-user networking-related objects: service, route (future)
  • end-user deployment/workflow objects: replication controller, daemon (future), deployment (future), job (future)
  • app admin objects: namespace, limit range, resource quota, service account
  • system objects: event, endpoints, component status, resource stats (future)
  • infrastructure / cluster admin objects: node, persistent volume
  • experimental apis

@smarterclayton
Contributor

Might want to split services off - if there are other control objects and policies around them, they could potentially be big enough to be distinct (i.e. OpenShift routes, external IPs or load balancers, affinity policies, etc.). That said, they're fairly self-contained.


@bgrant0607
Member

This would benefit from #9971

@bgrant0607
Member

From in-person discussion: All api groups will need to respect namespaces, for objects that should be namespaced.

@smarterclayton
Contributor

Another desired feature: in the future, we should be able to easily run subsets of the apiserver independently. That might cut across api groups, or only expose specific APIs (like the proxy and passthrough logging systems).

@bgrant0607
Member

I agree, but don't know that we need to tackle subsets as part of this effort.

Another consideration: We need to continue to expose v1 as a single API group. We might need to decouple the effort to add new API groups from splitting the existing monolithic group. Or, maybe everything should go into the same Scheme, which would then have a sparse conversion matrix, so that the same entity could be exposed via multiple API groups. In the objects, we'd add apiGroup as a separate field (in TypeMeta), but in the version maps we'd prefix the version keys with their API groups. We have previously added objects (PodTemplate) that haven't exposed API paths in all API versions.

@bgrant0607
Member

Those could also be plugins, I suppose.

@bgrant0607
Member

Probably need to fix #1490

@lavalamp
Member

However, it registers its objects under the kubernetes api.Scheme (instead of a separate runtime.Scheme) and is forced to follow the kubernetes version numbers. Moreover, OpenShift's "kinds" must be unique to avoid clashing with those in kubernetes.

I don't understand this part. Scheme doesn't care if version matches across objects or not, and you can register a kind with whatever name you want, you don't have to use the name of the go object.

@lavalamp
Member

If we make multiple schemes, we'd need a scheme registry (for people to find the scheme for their object), which seems like an awfully tall meta stack.

So instead consider making scheme provider aware. I'm not convinced that group needs special treatment, that just seems like window dressing apiserver tacks on.

Another reason to make scheme provider aware is so that you can mount the same object in multiple prefixes and have our encode/decode/convert code understand that it is the same object and doesn't need conversion.

@uluyol
Contributor Author

uluyol commented Jun 24, 2015

However, it registers its objects under the kubernetes api.Scheme (instead of a separate runtime.Scheme) and is forced to follow the kubernetes version numbers. Moreover, OpenShift's "kinds" must be unique to avoid clashing with those in kubernetes.

I don't understand this part. Scheme doesn't care if version matches across objects or not, and you can register a kind with whatever name you want, you don't have to use the name of the go object.

OpenShift needs to refer to kubernetes objects (if nothing else, those in api.{Type,Object}Meta), and as things stand, this requires being in the same version as those objects. OpenShift could use a different version number, but then the same objects would have to be re-registered under that version along with conversion and copy functions.

@smarterclayton
Contributor

That's definitely the biggest pain point today. I'm provisionally ok with locking our top API to a Kube consistent scheme, but for objects that update more rapidly it'll be hard to reuse type meta and object meta (if we want to split out our various object sets into multiple components that move upstream or into their own projects, for example).

We can fork the simple types but the deep types (pod spec, part of deployment) are uglier.


@lavalamp
Member

OpenShift needs to refer to kubernetes objects (if nothing else, those in api.{Type,Object}Meta), and as things stand, this requires being in the same version as those objects.

I don't get why. ObjectReference has a version field?

@smarterclayton
Contributor

The biggest reason so far is kind names. Minion vs Node is undecidable if you go through enough versions and you're referencing that object from something in a different API scheme (like a hypothetical build API at version v3 referencing what node it ran on with Kind "Node").


@uluyol
Contributor Author

uluyol commented Jun 24, 2015

I don't get why. ObjectReference has a version field?

I was referring to adding a type from the kubernetes api to another version. I was mixed up about the copy/conversion functions. Those are only needed when using a separate Scheme, not when using a different version number.

I went ahead and created a new test version with the following two types:

type Hello struct {
    api.TypeMeta   `json:",inline"`
    api.ObjectMeta `json:"metadata,omitempty"`

    Text     string              `json:"text,omitempty"`
    Text2    string              `json:"test,omitempty"`
    Template *v1.PodTemplateSpec `json:"template,omitempty"`
}

type HelloList struct {
    api.TypeMeta   `json:",inline"`
    api.ObjectMeta `json:"metadata,omitempty"`

    Items []Hello `json:"items,omitempty"`
}

In order to make things work, this required the following call:

api.Scheme.AddKnownTypes("exp",
    &Hello{},
    &HelloList{},

    &api.DeleteOptions{},
    &api.Namespace{},
    &api.ListOptions{},
    &api.Status{},
)

This appears to work (with all of the hooks in the master and storage) and I can create Hello objects under the new version. I had thought more registration was required, but I suppose not.

@lavalamp
Member

The biggest reason so far is kind names. Minion vs Node is undecidable if you go through enough versions and you're referencing that object from something in a different API scheme (like a hypothetical build API at version v3 referencing what node it ran on with Kind "Node")

I guess I must be dense today, because I still don't see what the issue is. So you reference minion:v1beta1. What is wrong with that? Can you give a concrete example? This is hard to reason about in the abstract.

I can see adding a provider to the type, e.g. kubernetes/minion:v1beta1. That makes a lot of sense.

@uluyol
Contributor Author

uluyol commented Jun 25, 2015

[This post follows a discussion with bgrant0607 and lavalamp]

Summary

Logically, an API object should be uniquely identifiable by an API provider, group, version, and kind. Having API groups allows for independent versioning of different components, while providers are responsible for serving multiple API groups. Supporting multiple providers is not a high priority, and so the focus will be on supporting multiple groups.

It should be possible to achieve this by making runtime.Scheme aware of API groups. Doing so would eliminate any need to have multiple Schemes, allowing api.Scheme to continue being the single global instance.

First Steps:

  1. All of the maps in runtime.Scheme (and conversion.Scheme) need to be extended to support groups. TypeMeta will also be given a new field: APIGroup.
  2. A new struct, KindSpec, will be introduced that holds the group, version, and kind. This way we can avoid having method names like ObjectGroupVersionAndKind or ObjectGroupVersionKind and reduce the number of arguments and return values.
  3. Methods that will need trivial changes: AddKnownTypes, AddKnownTypeWithName, KnownTypes, DataVersionAndKind (should be renamed DataKindSpec; this will require changes to pkg/conversion/meta.go), ObjectVersionAndKind (likewise), Recognized, New, AddFieldLabelConversionFunc, and ConvertFieldLabel.
  4. ConvertToVersion and EncodeToVersion may need modifications, but they will not accept the group as an argument. In other words, kinds may only be converted within groups, not across groups.
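A toy illustration of the KindSpec idea (the scheme type and its methods here are simplified stand-ins for runtime.Scheme, not the real implementation):

```go
package main

import "fmt"

// KindSpec bundles group, version, and kind so that method
// signatures stay small (hypothetical, per the proposal above).
type KindSpec struct {
	Group, Version, Kind string
}

// scheme is a toy group-aware registry mapping KindSpec -> type name.
type scheme struct {
	types map[KindSpec]string
}

// AddKnownTypeWithName registers a type name under a full KindSpec.
func (s *scheme) AddKnownTypeWithName(spec KindSpec, typeName string) {
	s.types[spec] = typeName
}

// KnownTypes lists the type names registered under a (group, version).
func (s *scheme) KnownTypes(group, version string) []string {
	var out []string
	for spec, name := range s.types {
		if spec.Group == group && spec.Version == version {
			out = append(out, name)
		}
	}
	return out
}

func main() {
	s := &scheme{types: map[KindSpec]string{}}
	s.AddKnownTypeWithName(KindSpec{"exp", "v1", "Hello"}, "Hello")
	fmt.Println(s.KnownTypes("exp", "v1")) // [Hello]
}
```

Because the map is keyed by the full KindSpec, the same kind name can coexist in different groups without clashing.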

@smarterclayton
Contributor

On Jul 16, 2015, Muhammed Uluyol wrote:

ListOptions / DeleteOptions are shared across multiple API groups (and they have conversions). The version they are used with may not be in the same version of their code. Requiring those to be explicitly registered by type and name in each group isn't the end of the world (unless I missed something making that impossible).

runtime.Scheme.ObjectVersionAndKind returns the version and kind based on the Go type, not the values in the object's TypeMeta (link to the code that handles this in pkg/conversion: https://github.com/GoogleCloudPlatform/kubernetes/blob/master/pkg/conversion/scheme.go#L384). Registering the same Go type in multiple groups breaks this behavior. I don't think we can have multi-group objects unless we plumb the (group, version, kind) everywhere, in which case we might as well just populate the TypeMeta.

This is why we support explicit registration today, and why SelfLinking against a go type is somewhat fraught. I don't think we should be depending on the static go type to define an object group - I can think of a ton of reasons to expose objects in multiple groups (such as experimental API groups and stable API groups). I think it's a mistake to not have to call into the Scheme (and to provide context from the code) to determine the group of an object. What matters is the encoding (the serialization), not the transient in-memory representation of the go type.

Maybe I missed the example of why having the go type determine the group is valuable?

@uluyol
Contributor Author

uluyol commented Jul 16, 2015

How would ObjectVersionAndKind function (or the group-extended version)? It doesn't make sense to extract version or kind information without the group. Within conversion.Scheme we have maps that go from (version, kind) to go type and go type to (version, kind). Both of these are necessary to support ObjectVersionAndKind's functionality to identify them from just the go type and create new objects using the Scheme. You can extend the first map to identify a (group, version, kind) but then the second one no longer makes sense when a single object is in multiple groups. You could add a parameter with the group name and resolve it that way, but I don't see how that is any better than plumbing the full TypeMeta.
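The two maps and the ambiguity described above can be sketched as follows (toy maps, not the real conversion.Scheme internals):

```go
package main

import (
	"fmt"
	"reflect"
)

type gvk struct{ Group, Version, Kind string }

// Hello stands in for an API object registered in two groups.
type Hello struct{}

// buildMaps registers the same Go type under every supplied gvk and
// returns the forward map (gvk -> type) and the reverse map (type -> gvk).
func buildMaps(specs []gvk, obj interface{}) (map[gvk]reflect.Type, map[reflect.Type]gvk) {
	t := reflect.TypeOf(obj)
	byGVK := map[gvk]reflect.Type{}
	byType := map[reflect.Type]gvk{}
	for _, s := range specs {
		byGVK[s] = t
		byType[t] = s // last registration wins: the reverse lookup is ambiguous
	}
	return byGVK, byType
}

func main() {
	specs := []gvk{{"exp", "v1", "Hello"}, {"stable", "v1", "Hello"}}
	byGVK, byType := buildMaps(specs, Hello{})
	// Two distinct (group, version, kind) entries collapse into one
	// reverse entry: given only the Go type, the group is unrecoverable.
	fmt.Println(len(byGVK), len(byType)) // 2 1
}
```

This is the crux of the objection: extending the forward map to groups is easy, but the reverse map stops being a function once one type lives in two groups.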

@smarterclayton
Contributor

Rereading the thread, I think I'm missing a key point that led to the proposal to separate group. Brian or Dan, can you describe more why group must be separate and not part of version?

If group is part of apiVersion, ObjectVersionAndKind continues to function by returning the canonical version for an object (which can be changed during encode as part of a no-op conversion). Then the different groups are just aliases of one another and we don't need to track the group internally. I.e. ("service/v1", "ListOptions") and ("pods/v2", "ListOptions") can be decoded to the same object (which might canonically be "common/v1"), and during encode we merely perform a conversion to ("pods/v2", "ListOptions").


@bgrant0607 bgrant0607 removed this from the v1.0-post milestone Jul 24, 2015
@lavalamp
Member

I can't think of a good argument for keeping version & group separate levels of the hierarchy.

I was going to say that our conversion functions would only work in a group, but I don't think that's really a difficult problem to solve.

Maybe completeness is an argument? We need a way to capture a set of objects that work with each other, and I thought version was that mechanism. We don't want people to declare that their api consists of group/v1.2/foo and group/v1.3/bar. But I guess we can enforce that via convention.

@nikhiljindal
Contributor

I went through this thread and I have a few questions:

  • I see a lot of discussion regarding making TypeMeta unversioned, but do we have a decision? Either way, how does making TypeMeta unversioned help with this issue?
    I liked @bgrant0607's comment regarding separating api machinery and the spec. runtime.TypeMeta is part of the machinery. Do we want to reuse it in the spec, if we decide to make TypeMeta unversioned?
  • We need to decide whether we want group as a separate field similar to version, or whether we update version to be "group/version" as proposed in #10009 (comment)

@bgrant0607
Member

Making TypeMeta unversioned simplifies plumbing through the codebase. Logically, it HAS to be unversioned, so that we can figure out the apiVersion and kind of a given resource across all possible groups, versions, and kinds. For instance, let's say a resource specified "type" and "ver" instead of "kind" and "version". That seems like it would introduce a lot of complexity and ambiguity (esp. since we have some resources with "type" fields).

As for whether we should lump group with version...

Some possible groups and versions (groupings are plausible but group names are just examples):
compute/v1: Pod, PodTemplate, Secret, PersistentVolumeClaim
compute/v2alpha1: ConfigData
lb/v1: Service, Endpoints
lb/v2alpha1: Route
deploy/v1: ReplicationController
deploy/v2alpha1: Deployment, Job, ScheduledJob
admin/v1: Namespace, LimitRange, ResourceQuota, ServiceAccount
info/v1: Event, ComponentStatus
info/v2alpha1: PodStats, NodeStats
cluster/v1alpha1: Daemon
infrastructure/v1: Node, PersistentVolume
extension/v1alpha1: ThirdPartyResource
experimental/v1alpha1: DiurnalScaler

Yes, I suppose that could work. We desperately need this, so I'm all for expedience, provided it doesn't create obvious problems, and I don't see any.

@bgrant0607
Member

And of course all the corresponding List resources.

@dchen1107
Member

cc/ @krousey This one is related to what we talked about yesterday on API group and api.Status versioning or unversioning, etc.

@nikhiljindal
Contributor

ok thanks @bgrant0607. I will review #10582 assuming that we are unversioning TypeMeta and are updating version to be group/version.
Thanks!

@fgrzadkowski
Contributor

To be honest, I don't see the value of so many API groups. Why not start with something simple (like introducing experimental) and introduce new groups only if we see a good use-case for it?

@lavalamp
Member

Assigning Chao for review, since he's working on this at the moment.

@davidopp
Member

I think the categories could be made more intuitive with a slight renaming and slight aggregation. How about compute, network, storage, collection, admin, experimental, extension? For example

compute: Pod, PodTemplate
network: Service, Endpoints, Route (does Route exist?)
storage: PersistentVolumeClaim, PersistentVolume
collection: ReplicationController, Deployment, Job, ScheduledJob, Daemon
admin: Namespace, LimitRange, ResourceQuota, ServiceAccount, Event, ComponentStatus, PodStats, NodeStats, Node, ConfigData, Secret
experimental: DiurnalScaler
extension: ThirdPartyResource

@pmorie
Member

pmorie commented Aug 27, 2015

Has anyone thought about the changes necessary to existing components that might want to work with new versions of an existing API group?

@derekwaynecarr
Member

I get why we want the API groups, but we do lose something today where each segment of our URL returns a result of some kind, so from /api/v1/namespaces/x/pods/foo, I can chop off a segment and get back a meaningful result that helps make the API sensible. So in the new model here, where you would have /compute/v1/namespaces/x/pods/foo, does /compute/v1/namespaces/x still return a meaningful result, or do I need to go to /admin/v1/namespaces/x?

@caesarxuchao
Member

@derekwaynecarr @pmorie, let's move the talk to #12951, where we have a RFC proposal for API groups.

@caesarxuchao
Member

I think we should only expose namespaces at /admin. My concern is the version. Imagine we have compute/v1alpha3/namespaces/x/pods/foo, but v1alpha3 does not exist in the admin group, what should we return for compute/v1alpha3/namespaces/x?

@nikhiljindal
Contributor

@derekwaynecarr Here is what I think we can do for that:
You need to go to /admin/v1/namespaces/x to get the namespace x.
/user/v1/namespaces/x returns all valid resource kinds that can be appended to that path, like:

ResourceList:
{
    "pods",
    "replicationControllers",
    ...
}

We have talked about returning a ResourceList before for /api/v1: #7883 (comment)

@caesarxuchao
Member

I'm closing this issue. Let's continue the discussion in #12951.
