Use the scheduler to execute packages periodically, starting at a specified date and time. Packages run as scheduled on an existing cluster that fits the scheduled cluster size; if no such cluster exists, one is provisioned automatically with the specified number of nodes. By default, the cluster is taken down as soon as package execution is completed.
You can view your schedules in the schedules list, where any schedule can be enabled or disabled.
To create a new schedule
- Click Schedules on the side menu.
- Click New schedule to open the new schedule dialog.
- Enter a Name for the schedule and optionally a Description.
- Optionally, change the Starts on date and time at which scheduled package execution begins. If the schedule repeats every day or week, subsequent executions will start at this time. If the schedule repeats every hour, subsequent executions will start at the specified minute after the hour. Note that the time is in UTC.
- Set the recurrence of your schedule by changing the Repeat every value and units.
- By default, a schedule will not execute a job while previous jobs executed by the same schedule are still running. Check Allow execution overlapping if you want the schedule to execute jobs regardless of the status of previous jobs.
- Move the Cluster Size slider to the number of cluster nodes to use for package execution.
- By default, the cluster will terminate after 1 minute of inactivity. If the package execution time and the schedule recurrence are shorter than 1 hour, we recommend turning automatic cluster termination off so the cluster can be reused.
- Set Re-use strategy:
- Any cluster created by this schedule - use a cluster created by this schedule if one is available. Otherwise, create a cluster.
- Any similar cluster (default) - use any existing cluster, as long as it has at least as many nodes as the schedule's cluster size. Otherwise, create a cluster.
- Never - create a new cluster every time the schedule runs.
- Click add package to add at least one package to execute.
- Choose a package from the list and click set variables.
- Set the value for any user variables or system variables and click Save. The variable values for a package in the schedule override the package defaults.
Note: Variable values are expressions, so you can use them to calculate relative datetime values, which is often useful in scheduled jobs. For example:
ToDate(ToString(SubtractDuration(CurrentTime(),'P1D'),'yyyy-MM-dd')) - returns a datetime value of yesterday at midnight.
ToString(SubtractDuration(CurrentTime(),'P1D'),'yyyy/MM/dd') - returns a string in the form yyyy/MM/dd to use in a path for yesterday's data.
AddDuration(ToDate('2000-01-01'),REPLACE('PnM','n',(chararray)MonthsBetween(CurrentTime(),ToDate('2000-01-01')))) - returns a datetime value of the first day of the current month.
- Optionally add additional packages.
- Change Status to on to enable the schedule.
- Click Create schedule.
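To make the relative datetime expressions in the note above easier to check, here is a minimal Python sketch of what each one is expected to compute. It is not the platform's implementation: the function names are hypothetical, and it assumes the expressions are evaluated in UTC.

```python
from datetime import datetime, timedelta, timezone


def yesterday_midnight(now: datetime) -> datetime:
    """Midnight at the start of the previous day (first expression)."""
    return (now - timedelta(days=1)).replace(
        hour=0, minute=0, second=0, microsecond=0
    )


def yesterday_path(now: datetime) -> str:
    """Previous day formatted as yyyy/MM/dd for use in a path (second expression)."""
    return (now - timedelta(days=1)).strftime("%Y/%m/%d")


def first_day_of_month(now: datetime) -> datetime:
    """First day of the current month at midnight (third expression)."""
    return now.replace(day=1, hour=0, minute=0, second=0, microsecond=0)


# Example: a job that runs on 1 March 2024 at 09:30 UTC.
now = datetime(2024, 3, 1, 9, 30, tzinfo=timezone.utc)
print(yesterday_midnight(now))   # 2024-02-29 00:00:00+00:00
print(yesterday_path(now))       # 2024/02/29
print(first_day_of_month(now))   # 2024-03-01 00:00:00+00:00
```

Running the sketch with a few sample dates (month boundaries, leap years) is a quick way to confirm that a variable expression will point at the data you expect before enabling the schedule.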
Configure Schedule Cluster
The schedule can create a new cluster to run jobs, or use existing clusters.
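The three re-use strategies can be read as a small selection rule. The sketch below is only an illustration of that rule as described in this article; the `Cluster` type, the strategy names `"self"`, `"any"` and `"none"`, and the `pick_cluster` function are hypothetical and are not part of any product API. A return value of `None` stands for "create a new cluster".

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Cluster:
    name: str
    nodes: int
    created_by: Optional[str] = None  # hypothetical: ID of the schedule that created it


def pick_cluster(
    strategy: str,
    schedule_id: str,
    required_nodes: int,
    clusters: Sequence[Cluster],
) -> Optional[Cluster]:
    """Return an existing cluster to reuse, or None to create a new one."""
    if strategy == "self":
        # Any cluster created by this schedule.
        candidates = [c for c in clusters if c.created_by == schedule_id]
    elif strategy == "any":
        # Any similar cluster: at least as many nodes as the schedule asks for.
        candidates = [c for c in clusters if c.nodes >= required_nodes]
    elif strategy == "none":
        # Never reuse: always create a new cluster.
        candidates = []
    else:
        raise ValueError(f"unknown strategy: {strategy!r}")
    return candidates[0] if candidates else None


clusters = [Cluster("a", nodes=2, created_by="sched-1"), Cluster("b", nodes=4)]

chosen = pick_cluster("any", "sched-2", 3, clusters)
print(chosen.name if chosen else "create a new cluster")  # b

chosen = pick_cluster("any", "sched-2", 8, clusters)
print(chosen.name if chosen else "create a new cluster")  # create a new cluster
```

The sketch takes the first matching cluster; how the scheduler chooses among several matching clusters is not described here, so treat that detail as an assumption.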
Select packages to run
To view and maintain schedules
In the schedules list, you can see all of your schedules along with their execution information. You can enable, disable, edit, duplicate, and delete each of your schedules. You can also trigger a one-off run of a schedule from the schedules list.