Ability to start CronJobs manually #47538

Closed
0xmichalis opened this issue Jun 14, 2017 · 29 comments
Assignees
Labels
area/workload-api/cronjob kind/feature Categorizes issue or PR as related to a new feature. sig/apps Categorizes an issue or PR as relevant to SIG Apps.

Comments

@0xmichalis
Contributor

It would be nice to support manual runs for CronJobs. Today, IIRC, if a CronJob misses its schedule for whatever reason, we won't start it until the next time it's scheduled to run. At the very least, this could be a kubectl generator; eventually we may want to support it as part of the CJ API.

@stevekuznetsov since you were asking about it today

@kubernetes/sig-apps-feature-requests

@k8s-ci-robot k8s-ci-robot added sig/apps Categorizes an issue or PR as relevant to SIG Apps. kind/feature Categorizes issue or PR as related to a new feature. labels Jun 14, 2017
@stevekuznetsov
Contributor

Another anti-UX behavior: if you change the schedule to something like 1/1 * * * * to ensure it runs soon without having to look at the time, the controller notices that a huge number of jobs should have been scheduled and refuses to run anything.
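
As a rough illustration of the behavior described above, the sketch below counts the runs missed between the last scheduled time and now, and gives up once the backlog passes a cap. It is not the controller's code: a fixed interval stands in for real cron parsing, and the names `missed_runs`, `MAX_MISSED` and `TooManyMissedRuns` are hypothetical. The cap of 100 is an assumption taken from the limit described in the Kubernetes CronJob documentation, not from this thread.

```python
from datetime import datetime, timedelta
from typing import List

MAX_MISSED = 100  # assumption: cap described in the CronJob docs, not in this thread


class TooManyMissedRuns(Exception):
    """Raised when the backlog of missed runs exceeds MAX_MISSED."""


def missed_runs(last_run: datetime, now: datetime, interval: timedelta) -> List[datetime]:
    """Return the start times missed in (last_run, now] for a fixed-interval schedule."""
    missed = []
    t = last_run + interval
    while t <= now:
        missed.append(t)
        if len(missed) > MAX_MISSED:
            # Too far behind: refuse to start anything instead of catching up.
            raise TooManyMissedRuns(f"more than {MAX_MISSED} missed start times")
        t += interval
    return missed


if __name__ == "__main__":
    now = datetime(2017, 6, 14, 12, 0)
    every_minute = timedelta(minutes=1)
    # Five minutes behind: the backlog is small, so a run could be started.
    print(len(missed_runs(now - timedelta(minutes=5), now, every_minute)))
    # A day behind on a one-minute schedule: the backlog exceeds the cap.
    try:
        missed_runs(now - timedelta(days=1), now, every_minute)
    except TooManyMissedRuns as exc:
        print("refusing to start:", exc)
```

Switching a rarely-run CronJob to a very frequent schedule makes the backlog since the last run look enormous, which is why nothing gets started.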

@dennis-bell

We are looking for much the same feature: we would like to be able to start a CronJob once right after creation, no matter when it would normally be scheduled.
Doing this with a manual trigger is also fine, as long as we can do it through the API.

@boosh

boosh commented Jun 20, 2017

@dennis-bell You could try a helm post-install hook if you're using helm. Run the job once via that hook then also release a cron job at the same time.

@soltysh soltysh self-assigned this Jul 31, 2017
@vinayagg

We are also looking for this feature but for different functionality.
We need to run "commands" for projects without excessive command line options or yaml files.

  • job should already be defined and installed
  • they just need one command to trigger it.

@albertvaka

This would also be very useful for debugging: when I have to test a CronJob, either I change the schedule to be a few minutes from now and apply it, or I create a new YAML file for a pod with the same configuration. Both ways are pretty inconvenient.

@UnrealCraig

It would be really useful to trigger a cronjob manually

@erhudy
Contributor

erhudy commented Sep 27, 2017

Us as well. I'd like to take a hack at this, though I can't assign it to myself.

@erhudy
Contributor

erhudy commented Sep 27, 2017

What I'd like to do (after talking the issue over a bit with @ant31) is add an AdhocJob resource type that would reference a particular CronJob for immediate execution, irrespective of the schedule provided for that CronJob.

@ant31
Member

ant31 commented Sep 27, 2017

@erhudy proposed a new field triggerOnCreation, which makes sense but would only partially address the issue.

The shortest path I see would be to create a Job forked from the CronJob. It could be handled on the CLI side to begin with:

kubectl create job --from-cronjob foobar

cc @kubernetes/sig-cli-feature-requests

@vinayagg

Another option could be to allow multiple schedules for a cron job. In that case:

  • we could use commandline to add a schedule to run immediately
  • it will allow for complex scheduling

@erhudy
Contributor

erhudy commented Sep 28, 2017

Based on the above discussion, I hacked together a proof-of-concept at https://github.com/bloomberg/kubernetes/tree/feature/create-ad-hoc-job-from-cronjob. You can execute something like

kubectl create job blah --cronjob some-cronjob

and it will create blah from the JobSpec in some-cronjob. I feel that this is a sufficient solution and it's not necessary to do anything that would result in API changes, because the nature of this problem is inherently oriented around impromptu testing. If you want to do something in an API-driven way, you can already do that right now by getting the CronJob in question and extracting the JobSpec from it.
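
The "get the CronJob and extract the JobSpec" route mentioned above could look roughly like the following. This is a minimal sketch that works on manifests as plain dictionaries, with no cluster access. The helper name `job_from_cronjob` and the sample manifest are hypothetical; `spec.jobTemplate` is the CronJob field that holds the embedded Job template.

```python
import copy


def job_from_cronjob(cronjob: dict, job_name: str) -> dict:
    """Build a Job manifest from the jobTemplate embedded in a CronJob manifest."""
    template = cronjob["spec"]["jobTemplate"]
    template_meta = template.get("metadata", {})
    return {
        "apiVersion": "batch/v1",
        "kind": "Job",
        "metadata": {
            "name": job_name,
            "namespace": cronjob["metadata"].get("namespace", "default"),
            # Carry over labels and annotations declared on the template.
            "labels": dict(template_meta.get("labels", {})),
            "annotations": dict(template_meta.get("annotations", {})),
        },
        # Deep copy so later edits to the Job never leak into the CronJob.
        "spec": copy.deepcopy(template["spec"]),
    }


if __name__ == "__main__":
    # Hypothetical CronJob manifest, shaped like one fetched from the API.
    some_cronjob = {
        "apiVersion": "batch/v2alpha1",
        "kind": "CronJob",
        "metadata": {"name": "some-cronjob", "namespace": "default"},
        "spec": {
            "schedule": "0 2 * * *",
            "jobTemplate": {
                "spec": {
                    "template": {
                        "spec": {
                            "restartPolicy": "Never",
                            "containers": [{"name": "task", "image": "busybox"}],
                        }
                    }
                }
            },
        },
    }
    print(job_from_cronjob(some_cronjob, "blah")["metadata"]["name"])
```

The resulting dictionary could then be serialized and submitted like any other Job manifest. Note that, as in the proof-of-concept, nothing here links the Job back to its CronJob.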

The POC has the following caveats that I'm aware of:

  • didn't write tests
  • I think there's probably a simpler way to handle the clientset situation than spawning 2 separate ones, one for batch/v2alpha1 for the CronJob and a regular one for the Job, but I am not familiar with clientsets
  • the created Job is not linked to the source CronJob, so the CronJob can still trigger a Job regardless of the parallelism settings - I'm thinking that this would be a good candidate for another flag, e.g. --link-to-cronjob or some such, so that you could spawn a Job either as an explicit descendant of the source CronJob and have it obey parallelism settings, or spawn it completely standalone

If people agree that this seems like a workable solution for the problem in question, I'll continue to flesh it out.

@mattfarina
Contributor

Another option if something needs to run once when an app starts is an init container.

@erhudy This is the kind of thing that would be good to discuss at a SIG Apps meeting. If you can find me on Slack we can try to work out a time to get you on the schedule.

@soltysh
Contributor

soltysh commented Oct 3, 2017

Just to let everyone interested know: we've talked with @erhudy about it on Slack. My most important suggestion is that this should be a sub-resource on a CronJob (invoke, instantiate or something like that), similar to how you get logs from a pod. This way it'll be accessible to the CLI and other consumers of our API. He will be working on updating the CronJob proposal and the implementation. Hopefully, by Oct 16th (when this is planned to be discussed during the SIG Apps call) we'll have the proposal out there and can discuss it in more detail.

@erhudy
Contributor

erhudy commented Oct 5, 2017

Restarted development work at https://github.com/erhudy/kubernetes/tree/feature/cronjob-manual-instantiation - currently functional for spawning a Job. Upcoming work will be on:

  • adding the subresource back to the CLI
  • associating the spawned Job back to the source CronJob
  • improving the disposition of the CronJobManualInstantiation subresource (probably embedding the spawned Job inside it)

k8s-github-robot pushed a commit to kubernetes/community that referenced this issue Oct 5, 2017
Automatic merge from submit-queue.

Add section on manually triggering CronJobs

As requested by @soltysh, in relation to kubernetes/kubernetes#47538.
@julia-stripe
Contributor

@erhudy very excited about your work on this. One thing to consider -- we implemented a very hacky version of this internally. Our hacky version doesn't respect the cron job's concurrency policy (so if you start the cronjob manually, starting it will succeed even if another instance is already running). Our users have told us that they find this alarming and that they would prefer that the "manual start" feature respect the cron job's concurrency policy.
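
To make the request above concrete, here is a minimal sketch of how a manual trigger might consult the concurrency policy before starting anything. The function name `plan_manual_start` and the shape of its return value are hypothetical; `Allow`, `Forbid` and `Replace` are the values of the CronJob `concurrencyPolicy` field. How a real implementation should behave is exactly what this thread is debating, so treat the branches as one possible reading.

```python
from typing import Dict, List

POLICIES = ("Allow", "Forbid", "Replace")


def plan_manual_start(concurrency_policy: str, active_jobs: List[str]) -> Dict[str, object]:
    """Decide what a manual trigger should do given the policy and running Jobs."""
    if concurrency_policy not in POLICIES:
        raise ValueError(f"unknown concurrencyPolicy: {concurrency_policy!r}")
    if not active_jobs or concurrency_policy == "Allow":
        # Nothing is running, or overlap is explicitly allowed.
        return {"start": True, "delete": []}
    if concurrency_policy == "Forbid":
        # Something is already running: reject the manual start.
        return {"start": False, "delete": []}
    # Replace: stop whatever is running, then start the new Job.
    return {"start": True, "delete": list(active_jobs)}


if __name__ == "__main__":
    print(plan_manual_start("Forbid", ["nightly-1507248000"]))
    print(plan_manual_start("Replace", ["nightly-1507248000"]))
```

This only works if the manually started Job is also visible to the controller as an active Job of the CronJob; otherwise the next scheduled run would not see it.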

@erhudy
Contributor

erhudy commented Oct 6, 2017

That's how my current implementation works - an instantiate subresource exists on a given CronJob, which when poked spawns an unassociated Job from the JobSpec. For obvious reasons that you mentioned, this isn't really satisfactory. I looked at writing the source CronJob back to the store with a modified CronJobStatus, but this would risk a race condition that could cause the CronJob controller to lose track of other associated jobs.

I'm totally okay with undertaking the work necessary to ensure that the instantiate action creates a new Job in a safe way that gets associated back to the source CronJob, and that it respects the parallelism/concurrency configurations of the CronJob, but I want to bring it up in the scheduled sig-apps meeting to make sure everyone's in agreement first before actually doing it. I think it should be possible to accomplish without too much heartache by refactoring the CronJob controller a bit, and I'll probably play around with a few ideas prior to the meeting.

@erhudy
Contributor

erhudy commented Oct 11, 2017

Here's the approach I'm taking right now: https://github.com/erhudy/kubernetes/commit/b603ab7ccbecc010f686f8507d6c0155b7246da9

When a Job is created via the instantiate subresource, another label is applied to the created job. The cronjob controller will pick up jobs with that label on them and add them to the active list of the cronjob specified as the job's owner reference.

Maybe this is sufficient for the general case?
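
The label-plus-owner-reference idea described above might be sketched as follows. This is not the code from the linked commit: the label key `example.com/manually-instantiated` and the helper `adopt_manual_jobs` are hypothetical, and manifests are modeled as plain dictionaries. `metadata.ownerReferences` and `status.active` are existing fields on Jobs and CronJobs respectively.

```python
from typing import List

MANUAL_LABEL = "example.com/manually-instantiated"  # hypothetical label key


def adopt_manual_jobs(cronjob: dict, jobs: List[dict]) -> List[dict]:
    """Return the CronJob's active list extended with labelled Jobs it owns."""
    uid = cronjob["metadata"]["uid"]
    active = list(cronjob.get("status", {}).get("active", []))
    known = {ref["name"] for ref in active}
    for job in jobs:
        meta = job["metadata"]
        if MANUAL_LABEL not in meta.get("labels", {}):
            continue  # not created through the manual trigger
        owned = any(
            ref.get("kind") == "CronJob" and ref.get("uid") == uid
            for ref in meta.get("ownerReferences", [])
        )
        if owned and meta["name"] not in known:
            active.append({
                "kind": "Job",
                "name": meta["name"],
                "namespace": meta.get("namespace", "default"),
            })
            known.add(meta["name"])
    # A real controller would also skip Jobs that have already finished.
    return active
```

Matching on the owner's UID rather than its name avoids adopting Jobs left behind by a deleted CronJob that happened to have the same name.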

@vinayagg

@erhudy In this mechanism, is there a way to pass parameters when invoking manually?

@erhudy
Contributor

erhudy commented Oct 11, 2017

What kind of parameters? Right now the action is not parameterized - it looks at the JobSpec embedded in the CronJob and creates a Job from it.

@vinayagg

When invoking manually, parameters would make it 10x more effective.
I can see all kinds of parameters, e.g.:

  • I have a job that never runs automatically but when I manually invoke, I want it to do X, Y or Z based on my needs
  • Normally my cache is updated everyday at 2 am but I want a hook to rebuild my cache if I find it corrupted.
  • Jenkins could trigger a job with build id to auto update the deployment

I am pretty sure the community will find a lot more use cases with this flexibility.

@erhudy
Contributor

erhudy commented Oct 11, 2017

CronJobs are for scheduled jobs. This issue and the PR are meant to provide a hook for people to kick the CronJob, mainly for testing purposes. Your use cases don't sound aligned to the goal of this PR.

Addressing them individually:

  • I have a job that never runs automatically but when I manually invoke, I want it to do X, Y or Z based on my needs
    • If it never runs automatically, it is not a CronJob.
  • Normally my cache is updated everyday at 2 am but I want a hook to rebuild my cache if I find it corrupted.
    • Why does this require parameters, and why would rebuilding the cache if corrupted be a separate action from updating the cache?
  • Jenkins could trigger a job with build id to auto update the deployment
    • This issue is about Job/CronJob resources and is not associated with Deployment resources.

@vinayagg

  • If it never runs automatically, it is not a CronJob.
    I agree. However there is no way to "precreate a job that I can run on demand". I have needed this on multiple occasions. But do you see any reason why it cannot meet this requirement?

  • Why does this require parameters, and why would rebuilding the cache if corrupted be a separate action from updating the cache?
    cron job - incremental update
    manually running it - if cron job failed for some reason or if I want to completely flush the cache and rebuild it.

  • This issue is about Job/CronJob resources and is not associated with Deployment resources.
    My point is that a facility to "run some code on demand" in the cluster will open up a lot of possibilities of automation and simplification.

I think there is a real need and it can be fulfilled by cronjob. If it needs minimal changes, I would encourage you to add it and fill the gap.

@rapenchukd

The PR looks like a good initial implementation.
I agree with all of @erhudy's points.
If you are trying to parameterize a job for a single use, it's very obvious that you should not be using a CronJob; at least I can't really see any reason for it. You are just using a CronJob instead of the intended Job. The purpose of the PR (as well as the issue) is to fill in the rare gaps where you have a pre-created CronJob that you expect to run, but for whatever reason it does not.

To me, it sounds like in the long run, OUTSIDE the scope of the issue, there needs to be a redesign of CronJobs and Jobs:
[CronJob]s shouldn't be a thing. The concept of "Cron Job" should just be a standard [Job] with a new [schedule] resource that can be applied to it via tagging, like with services and deployments now. This would allow you to just run the "Cron Job" ad hoc like you would any standard Job. Jobs can either be atomic for one-time use or stored for repeated triggering.

Though I may be insane and need more coffee.

@vinayagg

I am thinking that we are debating a higher level question, that is, until it is done the right way (whatever the right way may be, to be debated, the implementation of right way being months or years away), will it help if there was at least one way to do it? Maybe deprecate "wrong" way when something better is available?

@agilgur5

agilgur5 commented Nov 2, 2017

Agree with @rapenchukd that CronJobs could be redesigned at some point as they are just a specific type of job, that would likely simplify cases like these.

@ant31
Member

ant31 commented Nov 6, 2017

@vinayagg

However there is no way to "precreate a job that I can run on demand".

That's a separate topic: how to precreate resources in the cluster without 'deploying' them.
If it were a deployment resource, how would you do it?
It's a valid point, but more general than, and different from, this issue.

manually running it - if cron job failed for some reason or if I want to completely flush the cache and rebuild it.

You have one CronJob to clear the cache, and one Job (not a CronJob) to rebuild it.
When the cache is corrupted you trigger the Job; they are not doing the same task.
In general, parametrization is off-topic. You can still use or build third-party tools to implement the behavior you describe quite easily.

@soltysh
Contributor

soltysh commented Jan 12, 2018

Agree with @rapenchukd that CronJobs could be redesigned at some point as they are just a specific type of job, that would likely simplify cases like these.

There was a discussion about it early on when I was proposing to combine the two (see #11746), but we decided then to split them into two separate entities. CronJob is not a special case of a Job. A Job is meant to run once and finish. CronJob adds an additional level of complexity, which we decided to express through a separate controller. This issue and its solution will not provide an option to modify the CronJob during instantiation. They are meant to kick off a single instance of a Job defined in a CronJob as is, as @erhudy pointed out.

jingxu97 pushed a commit to jingxu97/kubernetes that referenced this issue Mar 13, 2018
… instance from a CronJob

This changeset adds the command `kubectl create job` with the flag `--from-cronjob`, which allows a user to create a Job from a CronJob via the CLI.
@MartinX3

MartinX3 commented Jan 3, 2019

The syntax changed to
kubectl create job test-job --from=cronjob/the-special-cronjob
in case someone else went looking for this feature, like me, and got confused by the older syntax "not working".
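
For scripting, the command above could be assembled programmatically. This is a minimal sketch: the helper `create_job_argv` is hypothetical, and the optional `--namespace` argument is kubectl's general namespace flag rather than something discussed in this thread. The example only prints the command and does not contact a cluster.

```python
import shlex
from typing import List, Optional


def create_job_argv(job_name: str, cronjob_name: str,
                    namespace: Optional[str] = None) -> List[str]:
    """Build the argv for creating a one-off Job from an existing CronJob."""
    argv = ["kubectl", "create", "job", job_name, f"--from=cronjob/{cronjob_name}"]
    if namespace:
        argv += ["--namespace", namespace]
    return argv


if __name__ == "__main__":
    argv = create_job_argv("test-job", "the-special-cronjob")
    print(shlex.join(argv))
    # To actually run it (requires kubectl and cluster access):
    #   import subprocess; subprocess.run(argv, check=True)
```

Passing the arguments as a list rather than a single shell string avoids quoting problems if a name ever contains unusual characters.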

@antoniivanov

At the moment, manually starting a cron job doesn't respect the cron job's concurrency policy.
This is a problem. We actively use many cron jobs (hundreds) to start short-lived processing tasks with different schedules (from every few minutes to every day or week). Those tasks are managed by many people (through a separate interface; the cron job is an implementation detail). Some of those tasks are started manually, and they should not run concurrently.

Is there any trick, such as passing a certain label, to make the CronJob controller recognize the job as its own so that it does not start a new one on schedule?
