MIT License
Copyright (c) 2018 Jeremy Auguste
Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
This repository groups a set of tools that make OAR easier to use.
# Requirements
OAR itself must be installed on the system for any of these commands to work.
Here are the requirements for each script:
- `oargen` requires Python 2 or 3. It uses the `oarsub` command from OAR.
- `oarstats` requires Python 2 or 3 and the *pyyaml* package. It uses the `oarstat` command from OAR.
- `batchoar` requires Python 3 and the *pyyaml* package. It uses the `oarsub` command from OAR.
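The *pyyaml* dependency can be installed with `pip install pyyaml`.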
# Usage
## OARGEN
The basic usage of `oargen` is simply `python oargen.py -r command` where *command* is the job
to execute on the cluster.
Several options can be used:
- --run: actually submit the job; without this flag, `oargen` only prints the `oarsub` command it would run.
- --time *TIME*: reservation walltime of your job. Accepted formats: *h*, *h:m* or *h:m:s*. Defaults to 10 hours.
- --core *NB\_CORES*: number of cores required by your job. Defaults to 1.
- --interactive: launch the job in interactive mode instead of passive mode.
- --gpu: request GPUs.
- --host *host1* [ *host2* ... *hostn* ]: names of the hosts the job is allowed to run on (SQL 'LIKE' patterns accepted).
- --ignore-host *host1* [ *host2* ... *hostn* ]: names of the hosts the job must not run on (SQL 'NOT LIKE' patterns accepted).
- --anterior *ANTERIOR\_ID*: the job will only start once the job with the specified ID has finished.
- --checkpoint *delay*: enable the checkpoint signal with the given delay (in seconds).
- --name *name*: name of the job.
- --directory *directory*: directory in which the standard output and standard error logs will be stored (it must already exist when --run is used).
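
For example, `python oargen.py --run --time 2:30 --core 4 --gpu --name myjob --directory logs ./train.sh` (with `./train.sh` standing in for any job command) requests 4 cores on a node with a GPU for a walltime of 2h30 and stores the logs under `logs/`.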
## OARSTATS
The basic usage of `oarstats` is simply `python oarstats.py`.
For each user, it prints the number of jobs, cores and GPUs in use, broken down
by queue, together with the user's karma and total reserved time.
One option can be used:
- --show-hosts: Adds an extra line for each user indicating which hosts are being used.
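
For example, `python oarstats.py --show-hosts` appends to each user a line such as `Running on: node1:3 node2:1` (host names illustrative), i.e. one `host:jobs` pair per assigned host.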
#!/usr/bin/env python
# coding: utf-8

import argparse
import yaml
import itertools
import logging
from collections import deque
import os

import oargen


def argparser():
    parser = argparse.ArgumentParser()
    parser.add_argument('job_file',
                        help="YAML file which contains all the jobs to launch")
    parser.add_argument('-b', '--besteffort', action="store_true",
                        help="Launch job in besteffort mode")
    parser.add_argument('-t', '--time', default="10",
                        help="Default maximum duration of the jobs (format: h[:m[:s]])")
    parser.add_argument('-g', '--gpu', action="store_true",
                        help="If True, reserves only cores with GPUs")
    parser.add_argument('-c', '--core', default=1, type=int,
                        help="Number of cores to reserve")
    parser.add_argument('-H', '--host', nargs="+", default=[],
                        help="Name of the hosts (SQL 'LIKE' syntax accepted)")
    parser.add_argument('-I', '--ignore-host', nargs="+", default=[],
                        help="Name of the hosts to ignore (SQL 'NOT LIKE' syntax accepted)")
    parser.add_argument('-i', '--interactive', action="store_true",
                        help="Launch job in interactive mode")
    parser.add_argument('-C', '--checkpoint', type=int, metavar="SECONDS",
                        help="Enable checkpoint signals with the given delay (in seconds)")
    parser.add_argument('-j', '--max-jobs', type=int, default=-1)
    parser.add_argument('--print-commands', action="store_true",
                        help="Print each individual command")
    parser.add_argument('-r', '--run', action="store_true",
                        help="Run the command")
    parser.add_argument('-d', '--directory',
                        help="Creates/specifies a directory and stores oarsub outputs in it.")
    parser.add_argument('-l', '--logger', default='INFO',
                        choices=['DEBUG', 'INFO', 'WARNING', 'ERROR', 'CRITICAL'],
                        help="Logging level: DEBUG, INFO (default), WARNING, ERROR")
    args = parser.parse_args()
    numeric_level = getattr(logging, args.logger.upper(), None)
    if not isinstance(numeric_level, int):
        raise ValueError("Invalid log level: {}".format(args.logger))
    logging.basicConfig(level=numeric_level)
    return args


def load_variables(yaml_variables):
    # Build the cartesian product of all variable values: one tuple per job.
    variable_names = []
    variables = []
    for name, values in yaml_variables.items():
        variable_names.append(name)
        variables.append(values)
    return variable_names, list(itertools.product(*variables))


def load_flags(yaml_flags, args):
    # Map each command flag to itself when the matching batchoar option is
    # set, and to the empty string otherwise.
    flags = {}
    for command_flag, oar_flag in yaml_flags.items():
        if getattr(args, oar_flag):
            flags[command_flag] = command_flag
        else:
            flags[command_flag] = ""
    return flags


def load_job(job_name, job, var_names, var_tuples, constants, flags, precommand):
    # Substitute $constant$, @flag@ and %variable% placeholders in both the
    # job command and its name, producing one job per variable combination.
    for name, value in constants.items():
        job = job.replace("${}$".format(name), str(value))
        job_name = job_name.replace("${}$".format(name), str(value))
    for flag, str_flag in flags.items():
        job = job.replace("@{}@".format(flag), str_flag)
        job_name = job_name.replace("@{}@".format(flag), str_flag)
    if precommand:
        job = "{} {}".format(precommand, job)
    jobs = []
    for var_tuple in var_tuples:
        new_job = job
        new_job_name = job_name
        for var_name, var in zip(var_names, var_tuple):
            new_job = new_job.replace("%{}%".format(var_name), str(var))
            new_job_name = new_job_name.replace("%{}%".format(var_name), str(var))
        jobs.append((new_job_name, new_job))
    return jobs


def load_jobs(yaml_in, args):
    yaml_jobs = yaml.safe_load(yaml_in)  # safe_load: the job file is plain data
    var_names, var_tuples = load_variables(yaml_jobs["variables"])
    constants = yaml_jobs["constants"] if "constants" in yaml_jobs else {}
    flags = load_flags(yaml_jobs["flags"], args)
    try:
        precommand = yaml_jobs["environment"]["precommand"]
    except KeyError:
        precommand = ""
    jobs = []
    for job_name, job in yaml_jobs["jobs"].items():
        jobs.extend(load_job(job_name, job, var_names, var_tuples, constants, flags, precommand))
    return jobs


def extract_job_id(cmd_output):
    for line in cmd_output.splitlines():
        if line.startswith("OAR_JOB_ID="):
            return line.split('=')[1]
    return None


def create_directory(directory, fake_run=False):
    if directory is None or fake_run:
        return
    if os.path.isdir(directory):
        return
    os.mkdir(directory)


def main():
    args = argparser()
    with open(args.job_file) as fin:
        jobs = load_jobs(fin, args)
    anteriors = deque()
    fake_id_counter = 0
    create_directory(args.directory, fake_run=not args.run)
    for job_name, job in jobs:
        # Once --max-jobs submissions are pending, chain each new job behind
        # the oldest one so that at most max_jobs jobs run at the same time.
        if args.max_jobs < 0 or len(anteriors) < args.max_jobs:
            anterior = None
        else:
            anterior = anteriors.pop()
        oar_command = oargen.prepare_oarsub(args.gpu, args.host, args.core, args.time,
                                            ignore_hosts=args.ignore_host,
                                            command=job, name=job_name,
                                            output_directory=args.directory,
                                            besteffort=args.besteffort,
                                            checkpoint=args.checkpoint, anterior=anterior)
        cmd_output = oargen.run_oarsub(oar_command, print_cmd=args.print_commands,
                                       fake_run=not args.run, return_output=True)
        if cmd_output is not None:
            job_id = extract_job_id(cmd_output)
            if job_id is None:
                logging.warning("Job '{}' seems to have failed to launch...".format(job_name))
            else:
                anteriors.appendleft(job_id)
                logging.debug("OAR_JOB_ID={}".format(job_id))
                if anterior is not None:
                    logging.debug("ANTERIOR_JOB_ID={}".format(anterior))
        elif args.run:
            logging.warning("Job '{}' didn't return anything...".format(job_name))
        else:
            fake_id = "FAKE_ID_{}".format(fake_id_counter)
            fake_id_counter += 1
            anteriors.appendleft(fake_id)
    logging.info("Scheduled {} jobs...".format(len(jobs)))


if __name__ == '__main__':
    main()
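
The sample job file below shows the YAML layout `batchoar` expects: entries under `constants` replace `$name$` placeholders, each combination of the lists under `variables` yields one job with its `%name%` placeholders filled in, `flags` maps a command flag to the `batchoar` option that enables it (here `--cuda` is inserted wherever `@--cuda@` appears when `--gpu` is passed, and removed otherwise), and `environment.precommand` is prepended to every command.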
constants:
  dev: corpora/dev_c1200.delim.feats.daadless.hdf5
  train: corpora/dev_c1200.delim.feats.daadless.hdf5
variables:
  hidden: [64, 128, 256]
  optimizer: ["Adam", "Adadelta"]
flags:
  --cuda: gpu
environment:
  precommand: pylauncher pytorch3
jobs:
  train.rnn.attn.%optimizer%.c1200_h%hidden%.ws.conseil: python train.py @--cuda@ --bidirectional --optimizer %optimizer% --conv-dim %hidden% --embeddings-dim 100 3 --input-datanames x_word x_speaker --output-datanames y_conseil -e 7 --checkpoint checkpoints/c1200_h%hidden%.ws.%optimizer%.daadless.conseil.mdl --valid $dev$ $train$ models/c1200_h%hidden%.ws.%optimizer%.daadless.conseil.mdl
  train.rnn.attn.%optimizer%.c1200_h%hidden%.ws.nps_3class: python train.py @--cuda@ --bidirectional --optimizer %optimizer% --conv-dim %hidden% --embeddings-dim 100 3 --input-datanames x_word x_speaker --output-datanames y_3class_nps -e 7 --checkpoint checkpoints/c1200_h%hidden%.ws.%optimizer%.daadless.conseil.nps_3class.mdl --valid $dev$ $train$ models/c1200_h%hidden%.ws.%optimizer%.daadless.nps_3class.mdl
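
To make the expansion concrete, here is a minimal, self-contained sketch of the same substitution rules that `load_job` applies, run on a shortened version of this file (all paths and values below are illustrative):

```python
# Sketch of batchoar's placeholder expansion; names and paths are illustrative.
import itertools

constants = {"dev": "corpora/dev.hdf5"}
variables = {"hidden": [64, 128], "optimizer": ["Adam", "Adadelta"]}
flags = {"--cuda": "--cuda"}  # as produced by load_flags when --gpu is set
template = ("python train.py @--cuda@ --optimizer %optimizer% "
            "--conv-dim %hidden% --valid $dev$")

job = template
for name, value in constants.items():        # $name$ -> constant value
    job = job.replace("${}$".format(name), str(value))
for flag, str_flag in flags.items():         # @flag@ -> flag or ""
    job = job.replace("@{}@".format(flag), str_flag)
for combo in itertools.product(*variables.values()):
    expanded = job
    for name, value in zip(variables.keys(), combo):  # %name% -> one value
        expanded = expanded.replace("%{}%".format(name), str(value))
    print(expanded)
# First line printed:
#   python train.py --cuda --optimizer Adam --conv-dim 64 --valid corpora/dev.hdf5
# ... one command per (hidden, optimizer) combination (4 in total here).
```

The job names go through the same substitutions, so the log files produced via `-d` stay distinguishable across combinations.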
from __future__ import print_function
from __future__ import unicode_literals

import os
import argparse
import subprocess


def argparser():
    parser = argparse.ArgumentParser()
    parser.add_argument('command', nargs=argparse.REMAINDER,
                        help="Command to use for the job (in passive mode)")
    parser.add_argument('-n', '--name',
                        help="Name to give to the job")
    parser.add_argument('-d', '--directory',
                        help="Directory in which the oarsub outputs will be stored")
    parser.add_argument('-b', '--besteffort', action="store_true",
                        help="Launch job in besteffort mode")
    parser.add_argument('-t', '--time', default="10",
                        help="Estimated maximum duration of the job (format: h[:m[:s]]) (default: %(default)s)")
    parser.add_argument('-g', '--gpu', action="store_true",
                        help="If True, reserves only cores with GPUs")
    parser.add_argument('-c', '--core', default=1, type=int,
                        help="Number of cores to reserve. (default: %(default)s)")
    parser.add_argument('-H', '--host', nargs="+", default=[],
                        help="Name of the hosts (SQL 'LIKE' syntax accepted)")
    parser.add_argument('-I', '--ignore-host', nargs="+", default=[],
                        help="Name of the hosts to ignore (SQL 'NOT LIKE' syntax accepted)")
    parser.add_argument('-i', '--interactive', action="store_true",
                        help="Launch job in interactive mode")
    parser.add_argument('-C', '--checkpoint', type=int, metavar="SECONDS",
                        help="Enable checkpoint signals with the given delay (in seconds)")
    parser.add_argument('-r', '--run', action="store_true",
                        help="Run the command")
    parser.add_argument('-a', '--anterior',
                        help="Anterior job id that must be terminated to start this new one")
    return parser.parse_args()


def prepare_oarsub(gpu, hosts, core, time,
                   ignore_hosts=[],
                   command=None,
                   interactive=False,
                   name=None, output_directory=None,
                   besteffort=False,
                   checkpoint=None, anterior=None):
    """Build the oarsub command line as a list of arguments."""
    oar_cmd = ["oarsub"]
    # -p: resource properties (GPU and host constraints).
    oar_cmd.append("-p")
    properties = ""
    if gpu:
        properties += "(gpu IS NOT NULL)"
    else:
        properties += "(gpu IS NULL)"
    if hosts:
        properties += " AND ("
        for idx, host in enumerate(hosts):
            if idx != 0:
                properties += " OR "
            properties += "host LIKE '{}'".format(host)
        properties += ")"
    for host in ignore_hosts:
        properties += " AND host NOT LIKE '{}'".format(host)
    oar_cmd.append(properties)
    # -l: requested resources (number of cores and walltime).
    oar_cmd.append("-l")
    time = time.split(':')
    hour = time[0]
    minutes = time[1] if len(time) >= 2 else "00"
    seconds = time[2] if len(time) >= 3 else "00"
    resources = "core={},walltime={}:{}:{}".format(core, hour, minutes, seconds)
    oar_cmd.append(resources)
    if name is not None:
        oar_cmd.extend(["-n", name])
        directory = output_directory + "/" if output_directory is not None else ""
        oar_cmd.extend(["-O", "{}{}.%jobid%.stdout".format(directory, name)])
        oar_cmd.extend(["-E", "{}{}.%jobid%.stderr".format(directory, name)])
    if besteffort:
        oar_cmd.extend(["-t", "besteffort", "-t", "idempotent"])
    if checkpoint is not None:
        # checkpoint is parsed as an int; oarsub arguments must be strings.
        oar_cmd.extend(["--checkpoint", str(checkpoint)])
    if anterior is not None:
        oar_cmd.extend(["-a", anterior])
    if interactive:
        oar_cmd.append('-I')
    else:
        job_command = command if isinstance(command, str) else " ".join(command)
        oar_cmd.append(job_command)
    return oar_cmd


def run_oarsub(command, print_cmd=False, fake_run=False, return_output=False):
    """Print and/or execute the oarsub command built by prepare_oarsub."""
    if print_cmd:
        print(subprocess.list2cmdline(command))
    if fake_run:
        return None
    if not return_output:
        subprocess.call(command)
        return None
    return subprocess.check_output(command).decode("utf8")


def main():
    args = argparser()
    if args.directory is not None and not os.path.isdir(args.directory) and args.run:
        raise RuntimeError("'{}' is not a directory!".format(args.directory))
    oar_command = prepare_oarsub(args.gpu, args.host, args.core, args.time,
                                 ignore_hosts=args.ignore_host,
                                 command=args.command,
                                 interactive=args.interactive,
                                 name=args.name, output_directory=args.directory,
                                 besteffort=args.besteffort, checkpoint=args.checkpoint,
                                 anterior=args.anterior)
    run_oarsub(oar_command, print_cmd=True, fake_run=not args.run)


if __name__ == '__main__':
    main()
#!/usr/bin/env python
# coding: utf-8

from __future__ import print_function
from __future__ import unicode_literals

import argparse
import logging
import sys
try:
    import yaml
except ImportError:
    print("Package 'pyyaml' needs to be installed to be able to use this script!", file=sys.stderr)
    exit(1)
import subprocess
from collections import defaultdict
import re
import time
import datetime


class Owner:
    """Aggregated job statistics for a single user."""

    def __init__(self, name):
        self.name = name
        self.queues = defaultdict(list)
        self.karma = defaultdict(float)
        self.timeleft = 0
        self.running = defaultdict(int)
        self.running_cores = defaultdict(int)
        self.running_gpus = defaultdict(int)
        self.resources = defaultdict(int)
        self.gpu = defaultdict(int)
        self.devices = defaultdict(int)

    def add_job(self, job):
        self.queues[job.queue].append(job)
        if job.karma > self.karma[job.queue]:
            self.karma[job.queue] = job.karma
        self.timeleft += job.wall_time - job.elapsed_time
        self.resources[job.queue] += job.resources
        if job.elapsed_time != 0:  # a job is running once it has a start time
            self.running[job.queue] += 1
            self.running_cores[job.queue] += job.resources
            if job.gpu:
                self.running_gpus[job.queue] += job.resources
        if job.gpu:
            self.gpu[job.queue] += job.resources
        for device in job.devices:
            self.devices[device] += 1

    def print_info(self, show_devices=False):
        print("User {} :: Total Time Reserved: {}".format(
            self.name, datetime.timedelta(seconds=self.timeleft)))
        for queue in self.queues.keys():
            print("\t{} - Running: {} jobs ({} cores, {} gpus), "
                  "Total: {} jobs ({} cores, {} gpus) - Karma: {}".format(
                      queue,
                      self.running[queue],
                      self.running_cores[queue],
                      self.running_gpus[queue],
                      len(self.queues[queue]),
                      self.resources[queue],
                      self.gpu[queue],
                      self.karma[queue]))
        if show_devices and self.devices:
            print("Running on: {}".format(
                " ".join(["{}:{}".format(device, amount)
                          for device, amount in self.devices.items()])))


class Job:
    """A single OAR job, as parsed from the oarstat output."""

    def __init__(self, job_id, elapsed_time, wall_time, resources, devices, gpu, queue, karma):
        self.job_id = job_id
        self.elapsed_time = elapsed_time
        self.wall_time = wall_time
        self.resources = resources
        self.devices = devices
        self.gpu = gpu
        self.queue = queue
        self.karma = karma


def argparser():
    parser = argparse.ArgumentParser()
    parser.add_argument('--show-hosts', action="store_true")
    parser.add_argument('-l', '--logger', default='INFO',
                        choices=['DEBUG', 'INFO', 'WARNING', 'ERROR', 'CRITICAL'],
                        help="Logging level: DEBUG, INFO (default), WARNING, ERROR")
    args = parser.parse_args()
    numeric_level = getattr(logging, args.logger.upper(), None)
    if not isinstance(numeric_level, int):
        raise ValueError("Invalid log level: {}".format(args.logger))
    logging.basicConfig(level=numeric_level)
    return args


def main():
    args = argparser()
    stats_output = subprocess.check_output(["oarstat", "--yaml"]).decode('utf-8')
    stats_yaml = yaml.safe_load(stats_output)
    owners = {}
    # Job attributes are parsed out of oarstat's free-form "message" field.
    resources_pattern = re.compile(r'R=([0-9]+)')
    walltime_pattern = re.compile(r'W=([0-9]+(:[0-9]+(:[0-9]+)?)?)')
    queue_pattern = re.compile(r'Q=(\S+)')
    karma_pattern = re.compile(r'Karma=([0-9]+\.[0-9]+)')
    gpu_pattern = re.compile('gpu is not null', flags=re.IGNORECASE)
    for job_id, job_info in stats_yaml.items():
        if job_info["owner"] not in owners:
            owners[job_info["owner"]] = Owner(job_info["owner"])
        elapsed_time = 0 if job_info["startTime"] == 0 else time.time() - job_info["startTime"]
        if job_info["message"] == '':
            continue
        # The walltime may contain only hours, or hours:minutes.
        tokens = re.search(walltime_pattern, job_info["message"]).group(1).split(':')
        wall_time = int(tokens[0]) * 3600
        if len(tokens) > 1:
            wall_time += int(tokens[1]) * 60
        if len(tokens) > 2:
            wall_time += int(tokens[2])
        resources = int(re.search(resources_pattern, job_info["message"]).group(1))
        try:
            queue = re.search(queue_pattern, job_info["message"]).group(1)
        except AttributeError:
            queue = "besteffort"
        try:
            karma = float(re.search(karma_pattern, job_info["message"]).group(1))
        except AttributeError:
            karma = 0.0
        devices = job_info["assigned_network_address"]
        gpu = re.search(gpu_pattern, job_info["properties"]) is not None
        job = Job(job_id, elapsed_time, wall_time, resources, devices, gpu, queue, karma)
        owners[job_info["owner"]].add_job(job)
    for owner in owners.values():
        owner.print_info(show_devices=args.show_hosts)
        print()


if __name__ == '__main__':
    main()
#! /bin/zsh

if [[ $# -lt 2 ]]; then
    echo "Usage: $0 environment_name [OPTIONS] command [arg1..argn]" >&2
    echo "OPTIONS:" >&2
    echo "  -h, --help           Show this message" >&2
    echo "  -p, --cuda-path      Specify the path where cuda is installed (default: /usr/local/cuda-10.1)" >&2
    echo "  -a, --anaconda-path  Specify the path where anaconda is installed (default: /storage/raid1/homedirs/jeremy.auguste/.anaconda3)" >&2
    exit 1
fi

cuda_path="/usr/local/cuda-10.1"
anaconda_path="/storage/raid1/homedirs/jeremy.auguste/.anaconda3"

while [[ $1 == -* ]]; do
    case "$1" in
        -p|--cuda-path)
            cuda_path="$2"
            shift 2
            ;;
        -a|--anaconda-path)
            anaconda_path="$2"
            shift 2
            ;;
        -h|--help)
            echo "Usage: $0 environment_name [OPTIONS] command [arg1..argn]"
            echo "OPTIONS:"
            echo "  -h, --help           Show this message"
            echo "  -p, --cuda-path      Specify the path where cuda is installed (default: /usr/local/cuda-10.1)"
            echo "  -a, --anaconda-path  Specify the path where anaconda is installed (default: /storage/raid1/homedirs/jeremy.auguste/.anaconda3)"
            exit 0
            ;;
        *)
            echo "Error: Unknown option: $1" >&2
            echo "Usage: $0 environment_name [OPTIONS] command [arg1..argn]" >&2
            exit 1
            ;;
    esac
done

environment="$1"
shift 1

source "$anaconda_path/etc/profile.d/conda.sh"
conda activate $environment

if [[ $environment =~ ^keras.* || $environment =~ ^pytorch.* ]]; then
    export CUDA_HOME=$cuda_path
    export CUDA_ROOT=$CUDA_HOME
    export PATH=$CUDA_HOME/bin:$PATH
    export MANPATH=$CUDA_HOME/doc/man:$MANPATH
......