svc-sundial 0.2.7

A service for scheduling and running Docker-based jobs

  • Contact: N/A
  • License: N/A

Resources

healthcheck

Operations

Method and Path Description
GET /_internal_/metrics

process

Operations

Method and Path Description
GET /api/processes/

Retrieves a list of process runs based on search parameters.

GET /api/processes/:process_id

Retrieves a process run

POST /api/processes/:process_id/retry

Retries a failed process run by restarting failed tasks; tasks will be given a single additional attempt

POST /api/processes/:process_id/kill

Terminates an active process
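
As a rough illustration of the retry and kill operations above, the following Python sketch builds (but does not send) the corresponding POST requests. The host `http://sundial.example.com` and the helper name `process_action_request` are assumptions made for this example; only the methods and paths come from the listing.

```python
import urllib.request
import uuid

BASE_URL = "http://sundial.example.com"  # hypothetical host, not part of the spec


def process_action_request(process_id, action):
    """Build a POST request for /api/processes/:process_id/<action>."""
    if action not in ("retry", "kill"):
        raise ValueError(f"unsupported action: {action}")
    # process_id is a uuid in the process model, so validate it first
    pid = uuid.UUID(str(process_id))
    url = f"{BASE_URL}/api/processes/{pid}/{action}"
    # An empty body is an assumption; the listing does not describe a payload
    return urllib.request.Request(url, data=b"", method="POST")


req = process_action_request("123e4567-e89b-12d3-a456-426614174000", "retry")
print(req.get_method(), req.full_url)
```

Passing the returned object to `urllib.request.urlopen` would actually send it; that step is left out so the sketch runs without a live service.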

process_definition

Operations

Method and Path Description
GET /api/process_definitions/

Get all currently registered process definitions

GET /api/process_definitions/:process_definition_name

Get a registered process definition

PUT /api/process_definitions/:process_definition_name

Updates or creates a process definition

DELETE /api/process_definitions/:process_definition_name

Deletes a registered process definition

POST /api/process_definitions/:process_definition_name/trigger

Triggers a new instance of the process

POST /api/process_definitions/:process_definition_name/pause

Pause the process schedule

POST /api/process_definitions/:process_definition_name/resume

Resume the process schedule

task

Operations

Method and Path Description
GET /api/tasks/

Retrieves the most recent tasks meeting the given criteria

POST /api/tasks/:task_id/log_entries

Appends log entries for a task; intended for use within the task executable

POST /api/tasks/:task_id/metadata

Appends metadata entries for a task; intended for use within the task executable

POST /api/tasks/:task_id/succeed

Marks the task as having succeeded

POST /api/tasks/:task_id/fail

Marks the task as having failed

Headers

No headers

Imports

No imports

Enums

default_emr_job_flow_role

The default EMR EC2 Instance role

Name Value Description
default_emr_job_flow_role EMR_EC2_DefaultRole

default_emr_service_role

The default EMR Service role

Name Value Description
default_emr_service_role EMR_DefaultRole

emr_application

List of applications to install in the EMR cluster

Name Value Description
Hadoop Hadoop

Hive Hive

Mahout Mahout

Pig Pig

Spark Spark

emr_release_label

Version (aka label in AWS-EMR) of the EMR stack to create.

Name Value Description
emr-5.14.0 emr-5.14.0

emr-5.13.0 emr-5.13.0

emr-5.12.1 emr-5.12.1

emr-5.12.0 emr-5.12.0

emr-5.11.1 emr-5.11.1

emr-5.11.0 emr-5.11.0

emr-5.10.0 emr-5.10.0

emr-5.9.0 emr-5.9.0

emr-4.9.2 emr-4.9.2

emr-4.9.1 emr-4.9.1

notification_options

Name Value Description
always always

Always notify when a process completes

on_failure on_failure

Notify when a process fails

on_state_change on_state_change

Notify when a process goes from succeeding to failing and vice versa

on_state_change_and_failures on_state_change_and_failures

Notify when going from failing to succeeded and on each failure

never never

Never notify
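
One possible reading of the notification_options values, written as a small Python function. This is a sketch based only on the descriptions above, not the service's actual logic; the function name `should_notify` and the treatment of a first run (no previous status counts as no state change) are assumptions.

```python
def should_notify(option, previous_status, current_status):
    """Decide whether to notify after a process run completes.

    previous_status / current_status are process_status values
    ("succeeded" or "failed"); previous_status may be None on a first run.
    """
    failed = current_status == "failed"
    # Assumption: with no previous run there is no state change to report
    changed = previous_status is not None and previous_status != current_status
    if option == "always":
        return True
    if option == "on_failure":
        return failed
    if option == "on_state_change":
        return changed
    if option == "on_state_change_and_failures":
        return failed or changed
    if option == "never":
        return False
    raise ValueError(f"unknown notification option: {option}")


print(should_notify("on_state_change", "succeeded", "failed"))
```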

on_demand

The on-demand market type

Name Value Description
on_demand on_demand

process_overlap_action

Name Value Description
wait wait

The process should wait until the currently running instance finishes

terminate terminate

The currently running process should be killed

process_status

Name Value Description
running running

The process has tasks currently executing

succeeded succeeded

All of the process's tasks succeeded on its last run

failed failed

At least one of the process's tasks failed on its last run

task_status

Name Value Description
submitted submitted

The task has been submitted

runnable runnable

starting starting

pending pending

The task is waiting on compute resources

running running

The task is currently executing or awaiting backoff

failed failed

The task has irrevocably failed

succeeded succeeded

The task has succeeded without serious errors
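
A client polling a task may want to know when to stop. The sketch below models the task_status values as a Python enum and treats `failed` and `succeeded` as terminal, which is an inference from the descriptions above ("irrevocably failed", "has succeeded") rather than something the spec states. The names `TaskStatus` and `is_terminal` are illustrative.

```python
from enum import Enum


class TaskStatus(str, Enum):
    SUBMITTED = "submitted"
    RUNNABLE = "runnable"
    STARTING = "starting"
    PENDING = "pending"
    RUNNING = "running"
    FAILED = "failed"
    SUCCEEDED = "succeeded"


# Assumption: only these two statuses mean the task will not change again
TERMINAL_STATUSES = {TaskStatus.FAILED, TaskStatus.SUCCEEDED}


def is_terminal(status):
    """Return True if the given task_status value is considered final."""
    return TaskStatus(status) in TERMINAL_STATUSES


print(is_terminal("running"), is_terminal("failed"))
```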

Interfaces

No interfaces

Models

batch_image_command

Example Json: Minimal | Full

Interfaces: None

Field Type Required? Default Description
image string Yes -

Name of docker image including registry URL if needed

tag string Yes latest

Tag on docker image

command [string] Yes -

Command to pass to Docker container

memory integer Yes -

vCpus integer Yes -

job_role_arn string No -

ARN of an IAM role, see http://docs.aws.amazon.com/AmazonECS/latest/developerguide/task-iam-roles.html

environment_variables [environment_variable] Yes []

environment variables to be passed to container

job_queue string No -

Overrides the default job queue, e.g. to use a priority queue or a GPU instance queue
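
To make the batch_image_command field table more concrete, here is a minimal Python sketch that assembles such an object, applying the listed defaults (`tag` defaults to `latest`, `environment_variables` to an empty list) and omitting optional fields that are not set. The builder name, the image name, and the numeric values are hypothetical; the spec does not state the unit of `memory`, so none is implied here.

```python
import json


def batch_image_command(image, command, memory, v_cpus, tag="latest",
                        environment_variables=None, job_role_arn=None,
                        job_queue=None):
    """Assemble a batch_image_command object from the fields in the table."""
    body = {
        "image": image,
        "tag": tag,
        "command": list(command),
        "memory": memory,
        "vCpus": v_cpus,
        # Each entry follows the environment_variable model
        "environment_variables": [
            {"variable_name": name, "value": value}
            for name, value in (environment_variables or {}).items()
        ],
    }
    # Optional fields are only included when provided
    if job_role_arn is not None:
        body["job_role_arn"] = job_role_arn
    if job_queue is not None:
        body["job_queue"] = job_queue
    return body


example = batch_image_command(
    image="registry.example.com/etl",   # hypothetical image name
    command=["python", "run.py"],
    memory=1024,
    v_cpus=1,
    environment_variables={"ENV": "integration"},
)
print(json.dumps(example, indent=2))
```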

continuous_schedule

Example Json: Minimal | Full

Interfaces: None

Field Type Required? Default Description
buffer_seconds integer No -

The minimum amount of time (in seconds) that must pass between executions of the process

cron_schedule

Example Json: Minimal | Full

Interfaces: None

See http://quartz-scheduler.org/api/2.2.0/org/quartz/CronExpression.html

Field Type Required? Default Description
day_of_week string Yes -

month string Yes -

day_of_month string Yes -

hours string Yes -

minutes string Yes -
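
The five cron_schedule fields are all strings, which suggests they carry Quartz-style field expressions. The Python sketch below builds such an object for "every day at 03:30". The helper name and the default values (`*` for day_of_month and month, `?` for day_of_week) are assumptions chosen for the example; how Sundial combines the fields into a full Quartz expression is not described here.

```python
def cron_schedule(minutes, hours, day_of_month="*", month="*", day_of_week="?"):
    """Build a cron_schedule object; every field must be a non-empty string."""
    fields = {
        "day_of_week": day_of_week,
        "month": month,
        "day_of_month": day_of_month,
        "hours": hours,
        "minutes": minutes,
    }
    for name, value in fields.items():
        if not isinstance(value, str) or not value:
            raise ValueError(f"{name} must be a non-empty string")
    return fields


# Every day at 03:30, assuming Quartz field syntax
print(cron_schedule(minutes="30", hours="3"))
```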

custom_emr_job_flow_role

Example Json: Minimal | Full

Interfaces: None

Custom, preconfigured, EMR EC2 instance role

Field Type Required? Default Description
role_name string Yes -

custom_emr_service_role

Example Json: Minimal | Full

Interfaces: None

Custom, preconfigured, EMR Service role

Field Type Required? Default Description
role_name string Yes -

docker_image_command

Example Json: Minimal | Full

Interfaces: None

Field Type Required? Default Description
image string Yes -

tag string Yes latest

command [string] Yes -

memory integer No -

cpu integer No -

taskRoleArn string No -

log_paths [string] Yes []

environment_variables [environment_variable] Yes []

email

Example Json: Minimal | Full

Interfaces: None

An email to send notifications to

Field Type Required? Default Description
name string Yes -

email string Yes -

notify_when notification_options Yes on_state_change_and_failures

emr_command

Example Json: Minimal | Full

Interfaces: None

Field Type Required? Default Description
emr_cluster emr_cluster Yes -

The EMR cluster on which to launch the job (aka step)

job_name string Yes -

Name to be assigned to the job

region string Yes us-east-1

the AWS Region

class string Yes -

The spark job's Main.class to run

Example: org.apache.spark.examples.SparkPi

s3_jar_path string Yes -

the s3 path to the jar of the job to be run

Example: s3://my-emr-source-bucket/my-emr-jar.jar

spark_conf [string] Yes -

options that will be sent to spark as --conf key=value, e.g. --conf spark.driver.extraJavaOptions=-Denvironment=integration

Example: ["spark.driver.extraJavaOptions=-Denvironment=integration"]

args [string] Yes -

Command line arguments to be passed to the job's main class

Example: ["--executor-memory", "1g"]

s3_log_details s3_log_details No -

AWS EMR periodically (currently every 5 minutes) zips up all the logs and uploads them to S3. This does not work well with Sundial's live logs. Use S3 Logs to view a job's log in real time in Sundial's live logs panel

load_data [s3_cp] No -

Triggers an S3DistCp job for every entry in this array; can be used to load data from S3 into HDFS. Requires Hadoop as an application on the cluster

save_results [s3_cp] No -

Triggers an S3DistCp job for every entry in this array; can be used to save data from HDFS to S3. Requires Hadoop as an application on the cluster
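
The spark_conf description above says each entry is sent to Spark as `--conf key=value`. A minimal Python sketch of that expansion follows; the function name `spark_conf_args` is hypothetical and the check for `=` is an added safeguard, not documented behaviour.

```python
def spark_conf_args(spark_conf):
    """Expand spark_conf entries into '--conf key=value' argument pairs."""
    args = []
    for entry in spark_conf:
        if "=" not in entry:
            raise ValueError(f"expected key=value, got: {entry!r}")
        args += ["--conf", entry]
    return args


print(spark_conf_args(["spark.driver.extraJavaOptions=-Denvironment=integration"]))
```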

emr_instance_group_details

Example Json: Minimal | Full

Interfaces: None

Additional configuration for Master/Core/Task EMR instance Groups.

Field Type Required? Default Description
emr_instance_type string Yes -

instance_count integer Yes -

aws_market aws_market Yes -

ebs_volume_size integer No -

Optional size (in GB) of an EBS volume to attach to the EMR instance. For simplicity, only one volume of the given size, of type gp2, will be created.

environment_variable

Example Json: Minimal | Full

Interfaces: None

Field Type Required? Default Description
variable_name string Yes -

value string Yes -

existing_emr_cluster

Example Json: Minimal | Full

Interfaces: None

Existing EMR cluster configuration object

Field Type Required? Default Description
cluster_id string Yes -

The EMR cluster id

Example: j-2CRR69WTM7N31

healthcheck

Example Json: Minimal | Full

Interfaces: None

Field Type Required? Default Description
status string Yes -

log_entry

Example Json: Minimal | Full

Interfaces: None

Field Type Required? Default Description
log_entry_id uuid Yes -

Uniquely identifies the log message to prevent duplication

when date-time-iso8601 Yes -

source string Yes -

message string Yes -
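
A task executable that appends log entries needs a fresh `log_entry_id` and an ISO 8601 timestamp for each message. The Python sketch below shows one way to build a single log_entry object; the helper name is hypothetical, and the exact request body expected by the log_entries endpoint (for example, whether entries are wrapped in a list) is not specified here.

```python
import uuid
from datetime import datetime, timezone


def new_log_entry(source, message, when=None):
    """Build a log_entry object with a unique id and an ISO 8601 timestamp."""
    when = when or datetime.now(timezone.utc)
    return {
        # A new uuid per message lets the service detect duplicates
        "log_entry_id": str(uuid.uuid4()),
        "when": when.isoformat(),
        "source": source,
        "message": message,
    }


print(new_log_entry("etl-task", "starting extract step"))
```

Reusing the same `log_entry_id` when re-sending a message after a failed request would fit the stated purpose of the field, which is to prevent duplication.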

metadata_entry

Example Json: Minimal | Full

Interfaces: None

Field Type Required? Default Description
metadata_entry_id uuid Yes -

Uniquely identifies the metadata entry to prevent duplication

when date-time-iso8601 Yes -

key string Yes -

value string Yes -

new_emr_cluster

Example Json: Minimal | Full

Interfaces: None

New EMR cluster configuration object

Field Type Required? Default Description
name string Yes -

release_label emr_release_label Yes emr-5.10.0

applications [emr_application] Yes -

s3_log_uri string Yes -

master_instance emr_instance_group_details Yes -

core_instance emr_instance_group_details No -

task_instance emr_instance_group_details No -

ec2_subnet string No -

To be set in case the EMR cannot be launched in the default VPC

emr_service_role emr_service_role Yes -

role to be assigned to the service, the default one should cover most of the cases

emr_job_flow_role emr_job_flow_role Yes -

role to be assigned to the ec2 instances composing the EMR cluster. The default one should cover most of the cases

visible_to_all_users boolean Yes false

whether or not to show the new cluster in the list of clusters. For more details see emr docs

pagerduty

Example Json: Minimal | Full

Interfaces: None

Pager Duty integration

Field Type Required? Default Description
service_key string Yes -

num_consecutive_failures integer Yes 1

api_url string Yes https://events.pagerduty.com

process

Example Json: Minimal | Full

Interfaces: None

Field Type Required? Default Description
process_id uuid Yes -

process_definition_name string Yes -

start_time date-time-iso8601 Yes -

status process_status Yes -

task [task] Yes -

process_definition

Example Json: Minimal | Full

Interfaces: None

A grouping of related tasks that are run as a single unit on the same schedule

Field Type Required? Default Description
process_definition_name string Yes -

paused boolean No -

If true, ignore schedule and only start process if triggered manually

process_description string No -

schedule process_schedule No -

The schedule that the process runs on; if not specified, the process will only run when triggered manually

task_definitions [task_definition] Yes -

overlap_action process_overlap_action Yes wait

notifications [notification] No -

s3_cp

Example Json: Minimal | Full

Interfaces: None

Wrapper of the S3DistCp Spark job. For more info, visit the AWS s3-dist-cp page

Field Type Required? Default Description
source string Yes -

Source folder or file to copy. It can be either an S3 or HDFS location

Example: s3:///my-bucket-root - hdfs://tmp/my-temp-folder

destination string Yes -

Destination folder or file to copy to. It can be either an S3 or HDFS location

Example: s3:///my-bucket-root - hdfs://tmp/my-temp-folder

s3_log_details

Example Json: Minimal | Full

Interfaces: None

Field Type Required? Default Description
log_group_name string Yes -

log_stream_name string Yes -

shell_script_command

Example Json: Minimal | Full

Interfaces: None

Field Type Required? Default Description
script string Yes -

environment_variables [environment_variable] No -

spot

Example Json: Minimal | Full

Interfaces: None

the maximum spot bid price

Field Type Required? Default Description
bid_price decimal Yes -

task

Example Json: Minimal | Full

Interfaces: None

Field Type Required? Default Description
task_id uuid Yes -

process_id uuid Yes -

process_definition_name string Yes -

task_definition_name string Yes -

start_time date-time-iso8601 Yes -

end_time date-time-iso8601 No -

previous_attempt_count integer Yes -

log_entries [log_entry] Yes -

metadata_entries [metadata_entry] Yes -

execution_state [metadata_entry] No -

Internal bookkeeping metadata used for task scheduling (e.g. ECS task ID and cluster name)

status task_status Yes -

task_definition

Example Json: Minimal | Full

Interfaces: None

An individual task that runs as part of a process

Field Type Required? Default Description
task_definition_name string Yes -

The canonical name for this task used by other tasks to identify this task

dependencies [task_dependency] Yes -

The tasks that must have completed prior to this one beginning

executable task_executable Yes -

max_attempts integer Yes -

max_runtime_seconds integer No -

The execution time (for a single attempt) after which the system will kill the task

backoff_base_seconds integer Yes -

backoff_exponent double Yes 1

require_explicit_success boolean Yes false

If true, the task must explicitly update its status with Sundial in order to succeed.

task_dependency

Example Json: Minimal | Full

Interfaces: None

Field Type Required? Default Description
task_definition_name string Yes -

success_required boolean Yes true
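
Because each task_definition lists the tasks that must complete before it begins, the definitions in a process form a dependency graph. The Python sketch below derives one valid run order with the standard library's `graphlib` (Python 3.9+). It only illustrates the dependency relationship; it is not a description of how Sundial schedules tasks, and the task names are made up.

```python
from graphlib import TopologicalSorter


def run_order(task_definitions):
    """Return task names ordered so that dependencies come first."""
    graph = {
        task["task_definition_name"]: {
            dep["task_definition_name"] for dep in task["dependencies"]
        }
        for task in task_definitions
    }
    # static_order() raises graphlib.CycleError if dependencies are circular
    return list(TopologicalSorter(graph).static_order())


tasks = [
    {"task_definition_name": "load",
     "dependencies": [{"task_definition_name": "extract", "success_required": True}]},
    {"task_definition_name": "extract", "dependencies": []},
    {"task_definition_name": "report",
     "dependencies": [{"task_definition_name": "load", "success_required": True}]},
]
print(run_order(tasks))
```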

Unions

aws_market

Interfaces: None

  • Type discriminator: N/A

The EC2 market of the EC2 instances

Type Discriminator Value Example Json Description
on_demand on_demand Minimal | Full

spot spot Minimal | Full

emr_cluster

Interfaces: None

  • Type discriminator: N/A

Whether or not to create a new EMR cluster or re-use a pre-existing one

Type Discriminator Value Example Json Description
new_emr_cluster new_emr_cluster Minimal | Full

existing_emr_cluster existing_emr_cluster Minimal | Full

emr_job_flow_role

Interfaces: None

  • Type discriminator: N/A

The EMR role to use on the EC2 instances

Type Discriminator Value Example Json Description
default_emr_job_flow_role default_emr_job_flow_role Minimal | Full

custom_emr_job_flow_role custom_emr_job_flow_role Minimal | Full

emr_service_role

Interfaces: None

  • Type discriminator: N/A

The EMR service role to use

Type Discriminator Value Example Json Description
default_emr_service_role default_emr_service_role Minimal | Full

custom_emr_service_role custom_emr_service_role Minimal | Full

notification

Interfaces: None

  • Type discriminator: N/A

Type Discriminator Value Example Json Description
email email Minimal | Full

pagerduty pagerduty Minimal | Full

process_schedule

Interfaces: None

  • Type discriminator: N/A

A specification for when a process should be run

Type Discriminator Value Example Json Description
cron_schedule cron_schedule Minimal | Full

continuous_schedule continuous_schedule Minimal | Full
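
The unions in this section list no type discriminator. A common convention for such unions is to wrap the value in an object keyed by the type name; the Python sketch below assumes that convention for process_schedule, and it should be checked against the Example Json links before relying on it. The helper name `wrap_union` is hypothetical.

```python
PROCESS_SCHEDULE_TYPES = ("cron_schedule", "continuous_schedule")


def wrap_union(type_name, value, allowed_types):
    """Wrap a model in a single-key object named after its union type.

    Assumption: unions without a discriminator are serialized as
    {"<type_name>": {...}}.
    """
    if type_name not in allowed_types:
        raise ValueError(f"{type_name} is not a member of this union")
    return {type_name: value}


schedule = wrap_union("continuous_schedule", {"buffer_seconds": 300},
                      PROCESS_SCHEDULE_TYPES)
print(schedule)
```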

task_executable

Interfaces: None

  • Type discriminator: N/A

Type Discriminator Value Example Json Description
docker_image_command docker_image_command Minimal | Full

Docker image to run on ECS with Sundial companion container

deprecated:

Running jobs on AWS ECS is now deprecated. Use AWS Batch via batch_image_command instead.

shell_script_command shell_script_command Minimal | Full

Shell command to run on Sundial service instance (Experimental)

batch_image_command batch_image_command Minimal | Full

Docker image to run on AWS Batch

emr_command emr_command Minimal | Full

Command to submit Spark Jobs on AWS EMR

Annotations

No annotations