.. currentmodule:: pyke.builtins

Builtin Essentials
=======================================
All builtin rules defined in pyke are functions that schedule jobs to be run by the internal
job scheduler at build time.

Before you read the function reference, note that ALL builtin functions accept, by convention,
some optional keyword arguments. These keyword arguments give you more control and flexibility
when you write pyke scripts: they let you specify extra dependencies, declare extra files that a
command creates, and state under which circumstances a command should run.


Including builtins in your code
---------------------------------------------
The recommended way of using the builtin functions in your build scripts is to import everything
from ``pyke.builtins``, usually at the top of the master script. Importing once in the master
script is enough; all the functions will then be available in all the other pyke build scripts.

Place the following line at the top of the master.pyke file and you are done!

::

  from pyke.builtins import *


Working with paths
---------------------------------------------
Apart from the rules, you will always be working with paths: specifying which files will be
created, which will be removed, which are dependencies, and so on. This is why understanding the
convention used for paths is essential. The convention itself is simple:

  **All paths passed to builtin functions should be absolute paths or paths relative to the pyke
  build file.**

Let me repeat myself about relative paths. When this document talks about relative paths it means
that **paths are relative to pyke's build file**. Even if you include a Python module you have
created, and a function in that module calls a rule with a relative path, the path will be
relative to the pyke build file being executed, not to your module.

Since every pyke build has one, and only one, master file, you can also reference paths relative
to the master.pyke file by using the :func:`masterpath` function.

A simple example should clarify these concepts.

Imagine the following folder structure:
::

  + /home/user/project/
    + component/
      - build.pyke
    - master.pyke

Let's say this is the **master.pyke**:
::

  include ('component/build.pyke')

  # absolute location
  mkdir ('/home/user/project/output/')

  # relative to this build file
  mkdir ('output/')

  # relative to master.pyke
  mkdir (masterpath ('output/'))


And this is **build.pyke**:
::

  # absolute location
  mkdir ('/home/user/project/output/')

  # relative to build.pyke
  mkdir ('../output/')

  # relative to master.pyke
  mkdir (masterpath ('output/'))

You can easily see that we are always referencing the same location in both examples, using absolute
and relative paths.

Take a look at the :ref:`convenient_path_functions` section to see how to use explicit functions
to reference the files you want.
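
The path convention can also be sketched in plain Python. The ``resolve`` helper below is
hypothetical and is not part of pyke; it only assumes that a relative path is joined to the folder
of the build file that uses it, and that absolute paths are kept as they are. It uses
``posixpath`` so the result is the same on every platform.

.. code-block:: python

  import posixpath

  def resolve(path, build_file):
      """Hypothetical helper: resolve `path` as seen from `build_file`."""
      if posixpath.isabs(path):
          # absolute paths do not depend on the build file
          return posixpath.normpath(path)
      # relative paths start at the folder that contains the build file
      return posixpath.normpath(posixpath.join(posixpath.dirname(build_file), path))

  master = "/home/user/project/master.pyke"
  component = "/home/user/project/component/build.pyke"

  print(resolve("output/", master))                        # /home/user/project/output
  print(resolve("../output/", component))                  # /home/user/project/output
  print(resolve("/home/user/project/output/", component))  # /home/user/project/output

All three calls give the same folder, which matches the two build files shown above.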

.. _optional-kwargs:

Optional keyword arguments
---------------------------------------------
For convenience, there are optional keyword arguments that let you declare extra dependencies,
extra outputs, and run conditions. Optional keywords start with an underscore (**_**) by
convention, to avoid colliding with function arguments.

Builtin functions accept the following extra parameters:

- **_needs**
  is a list of extra file or folder dependencies that the job needs in order to run, either
  because these dependencies cannot be deduced by the job rule itself, or because you want to
  force a job to wait for another job.

- **_creates** | **_deletes** | **_updates**
  are lists of extra files that will be created, deleted or overwritten when the job runs, but
  which cannot be deduced from the command itself. They are also useful when you want to force
  some tasks to wait for other tasks.

    + **_creates** should be used to specify one or more files that will be created or overwritten
      after the job is executed.

    + **_deletes** should be used to specify one or more files that will be removed after the
      job is executed.

    + **_updates** should be used to specify one or more files that are both an input and an
      output of the command, so they should exist before and after the command is executed. This
      is not the common case, but imagine, for example, that you want to take a file generated by
      another command and do a text replacement on it *(in this case writing the replacement to
      another file would be a better idea, but you can do it in place if you need to)*.

- **_if**
  is used to specify the build policy that is going to be used. In other words, this value
  specifies under which conditions the function will be executed.
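
To see why declaring extra files can force one job to wait for another, here is a toy model
written in plain Python. It is only a sketch under stated assumptions: the job names, the file
names and the ``run_order`` function are hypothetical, and pyke's internal scheduler may work
differently. The model assumes that a job must run after every job that creates a file it needs.

.. code-block:: python

  from graphlib import TopologicalSorter  # Python 3.9 or newer

  # Hypothetical job table: name -> (files the job needs, files the job creates)
  jobs = {
      "unpack":   ({"archive.zip"}, {"src/main.c"}),
      "download": (set(), {"archive.zip"}),
      "compile":  ({"src/main.c"}, {"main.o"}),
  }

  def run_order(jobs):
      """Order jobs so that each one runs after the jobs creating its needed files."""
      # map every created file to the job that produces it
      producer = {f: name for name, (_, creates) in jobs.items() for f in creates}
      # a job depends on the producers of the files it needs
      graph = {
          name: {producer[f] for f in needs if f in producer}
          for name, (needs, _) in jobs.items()
      }
      return list(TopologicalSorter(graph).static_order())

  order = run_order(jobs)
  print(order)

In this model the declared files alone decide the order, regardless of the order in which the
jobs were written down.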


*_if* conditions
++++++++++++++++++++++++++++++++++++++++
The valid condition values are:

- **False**

  jobs will never be executed

- **True**

  jobs will always be executed

- **"empty"**

  *for dirs* jobs will run when dirs are empty (have no files inside) 

  *for files* jobs will run when files are empty (file_size = 0)

- **"updated"** (default for most rules)

  jobs will run when any input is newer than any output or when any output is missing

- **"exists"**

  jobs will be executed if any of the targets exists

- **"missing"**

  jobs will be executed if any target is missing

- **callback(inputs, outputs)**

  a callback function that will be called at runtime, so it can decide whether internal jobs should
  be executed or not. This function receives a list of all *input* files (and dependencies) and a
  list of all *output* files, in no special order. The function should return *True* if the jobs
  need to be executed and *False* otherwise.


Example using basic parameters:

  .. code-block:: python
  
    from pyke.builtins import *

    # "tmp_folder" will be created if it does not exist yet
    mkdir ("tmp_folder")

    # "tmp_folder" will be created every time you run pyke
    mkdir ("tmp_folder", _if = True)

    # "tmp_folder" will NEVER be created
    mkdir ("tmp_folder", _if = False)

    # "tmp_folder" will be created only when it does not exist
    mkdir ("tmp_folder", _if = "missing")

    # "tmp_folder" will be removed only if it exists
    rmdir ("tmp_folder", _if = "exists")


Example using a callback:

  .. code-block:: python

    from pyke.builtins import *
    import zipfile

    def is_zipped (inputs, outputs):
      """ Returns True when ANY of the input files are compressed and False otherwise """
      for infile in inputs:
        if zipfile.is_zipfile (infile):
          return True
      return False

    # ... some rules ...

    # download a file and uncompress it when the file is zipped
    download (source_url, destination_file)
    uncompress (destination_file, "out", _if = is_zipped)

In the example above, *destination_file* will only be decompressed when *is_zipped* returns True,
which in this case means that *destination_file* is a ZIP archive.
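
A callback can also express a timestamp based policy. The sketch below is a hypothetical callback
that follows the description of the **"updated"** condition (run when any output is missing or
when any input is newer than any output). It is not pyke's own implementation; it only relies on
the ``callback(inputs, outputs)`` signature described above and on the standard ``os`` module.

.. code-block:: python

  import os

  def is_outdated(inputs, outputs):
      """Return True when an output is missing or older than the newest input."""
      # a missing output always means the job has to run
      if any(not os.path.exists(path) for path in outputs):
          return True
      existing_inputs = [path for path in inputs if os.path.exists(path)]
      if not existing_inputs or not outputs:
          # nothing to compare, so nothing is out of date
          return False
      newest_input = max(os.path.getmtime(path) for path in existing_inputs)
      oldest_output = min(os.path.getmtime(path) for path in outputs)
      return newest_input > oldest_output

Such a function would be passed to a rule in the same way as *is_zipped*, for example
``_if = is_outdated``.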

Run commands serially
-------------------------------------

.. autofunction:: pyke.build.serialStart
  :noindex:
  
.. autofunction:: pyke.build.serialEnd
  :noindex: