只有兩個支持GPU資源的mesos框架:Marathon和Aurora。我想在具有GPU資源的mesos代理上啓動批處理作業。所以,只有Aurora支持這種類型的工作。但奧羅拉目前尚未正式支持dcos。我試圖整合但不成功。 DCOS Mesos主人不註冊Aurora框架,但參展商爲Aurora創建記錄。我沒有設法在mesos master logs中找到有關Aurora的任何記錄。這裏是我的極光調度器配置:將Apache Aurora與dcos集成
#!/bin/bash
GLOG_v=0
LIBPROCESS_PORT=8083
#LIBPROCESS_IP=127.0.0.1
JAVA_HOME=/opt/mesosphere/active/java/usr/java
JAVA_OPTS="-server -Djava.library.path='/opt/mesosphere/lib;/usr/lib;/usr/lib64'"
PATH=$PATH:/opt/mesosphere/bin
MESOS_NATIVE_JAVA_LIBRARY=/opt/mesosphere/lib/libmesos.so
LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/opt/mesosphere/lib
JAVA_LIBRARY_PATH=$JAVA_LIBRARY_PATH:/opt/mesosphere/lib
# Flags control the behavior of the Aurora scheduler.
# For a full list of available flags, run /usr/lib/aurora/bin/aurora-scheduler -help
AURORA_FLAGS=(
# The name of this cluster.
-cluster_name='My Cluster'
# The HTTP port upon which Aurora will listen.
-http_port=8088
# The ZooKeeper URL of the ZNode where the Mesos master has registered.
-mesos_master_address=zk://master_ip1:2181,master_ip2:2181,master_ip3:2181/mesos
# The ZooKeeper quorum to which Aurora will register itself.
-zk_endpoints=master_ip1:2181,master_ip1:2181,master_ip1:2181
# The ZooKeeper ZNode within the specified quorum to which Aurora will register its
# ServerSet, which keeps track of all live Aurora schedulers.
-serverset_path='/aurora/scheduler'
# Allows the scheduling of containers of the provided type.
-allowed_container_types='DOCKER,MESOS'
-allow_docker_parameters=true
-allow_gpu_resource=true
-executor_user=root
### Native Log Settings ###
# The native log serves as a replicated database which stores the state of the
# scheduler, allowing for multi-master operation.
# Size of the quorum of Aurora schedulers which possess a native log. If running in
# multi-master mode, consult the following document to determine appropriate values:
#
# https://aurora.apache.org/documentation/latest/deploying-aurora-scheduler/#replicated-log-configuration
-native_log_quorum_size=2
# The ZooKeeper ZNode to which Aurora will register the locations of its replicated log.
-native_log_zk_group_path='/aurora/replicated-log'
# The local directory in which an Aurora scheduler can find Aurora's replicated log.
-native_log_file_path='/var/lib/aurora/scheduler/db'
# The local directory in which Aurora schedulers will place state backups.
-backup_dir='/var/lib/aurora/scheduler/backups'
### Thermos Settings ###
# The local path of the Thermos executor binary.
-thermos_executor_path='/usr/bin/thermos_executor'
# Flags to pass to the Thermos executor.
-thermos_executor_flags='--announcer-ensemble 127.0.0.1:2181')
我還沒有聽說有人試圖在DC/OS上運行Aurora。所以你可能是第一個。它計劃讓節拍器支持GPU資源,但實際上它可能不會在未來幾個月。 – KarlKFI