Configuring bgdata
bgdata has a default configuration file which looks like:
version=2
local_repository = "~/.bgdata"
remote_repository = "http://bbglab.irbbarcelona.org/bgdata"
However, you can create your own configuration file to override these defaults.
Custom configuration
To create your own custom configuration, create a file named bgdatav2.conf and place it in the corresponding configuration folder. bgdata locates that folder with the user_config_dir function of the appdirs package, passing bbglab as the only parameter.
That file should follow the same structure as the default one, but you can change the sections to fit your own needs.
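As a minimal sketch of where that file would go, the snippet below builds the expected path. It assumes the third-party appdirs package is installed; config_file_path is a hypothetical helper written for this illustration, not part of bgdata, and the resulting folder depends on the operating system.

```python
import os


def config_file_path(config_dir, filename="bgdatav2.conf"):
    """Join a configuration folder and the bgdata file name (hypothetical helper)."""
    return os.path.join(os.path.expanduser(config_dir), filename)


try:
    # appdirs is a third-party package; user_config_dir("bbglab") returns
    # the per-user configuration folder for the "bbglab" application name.
    from appdirs import user_config_dir
except ImportError:
    print("appdirs is not installed; install it to resolve the folder")
else:
    print(config_file_path(user_config_dir("bbglab")))
```

The printed path is where a custom bgdatav2.conf would be expected on the current machine.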
The local folder (where the data packages are stored) is indicated through local_repository.
# The default local folder where you want to store the data packages
local_repository = "~/.bgdata"
Note
You can use any reachable path.
The remote repository is a (public) URL where the data packages are stored; bgdata uses it to look for packages that are not in the local repository.
# The remote URL from which you want to download the data packages
remote_repository = "http://bbglab.irbbarcelona.org/bgdata"
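To check what a custom file resolves to, the two settings can be read back with a short script. This is only a sketch using Python's standard configparser module, not bgdata's own loader: it assumes the file uses simple key = value lines, prepends a made-up [top] section because configparser requires a section header, and strips the surrounding quotes by hand. read_settings is a hypothetical helper.

```python
import configparser
import os

EXAMPLE = '''
version = 2
local_repository = "~/.bgdata"
remote_repository = "http://bbglab.irbbarcelona.org/bgdata"
'''


def read_settings(text):
    """Parse top-level key = value lines into a dict (hypothetical helper)."""
    parser = configparser.ConfigParser(interpolation=None)
    # configparser needs a section header, so wrap the top-level keys
    # in a placeholder section named "top" (an assumption of this sketch).
    parser.read_string("[top]\n" + text)
    settings = {key: value.strip('"') for key, value in parser["top"].items()}
    # Expand "~" to the current user's home directory.
    settings["local_repository"] = os.path.expanduser(settings["local_repository"])
    return settings


settings = read_settings(EXAMPLE)
print(settings["local_repository"])
print(settings["remote_repository"])
```

All values come back as strings, so version is "2" rather than the integer 2.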
If you need to access the remote repository through a proxy, you can configure it as follows:
# Optional proxy configuration
[proxy]
host = proxy.someurl.org
port = 8080
# If it's an authenticated proxy
user = myname
pass = mypasswd
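For illustration, the four proxy fields map naturally onto a single proxy URL of the kind most HTTP clients accept. The sketch below assumes an HTTP proxy and uses only the standard library; proxy_url is a hypothetical helper, and how bgdata itself consumes these settings is not shown here.

```python
from urllib.parse import quote
from urllib.request import ProxyHandler, build_opener


def proxy_url(host, port, user=None, password=None):
    """Build an http:// proxy URL from the proxy fields (hypothetical helper)."""
    credentials = ""
    if user is not None:
        # Percent-encode credentials so characters such as "@" or ":" are safe.
        credentials = quote(user, safe="")
        if password is not None:
            credentials += ":" + quote(password, safe="")
        credentials += "@"
    return "http://{}{}:{}".format(credentials, host, port)


url = proxy_url("proxy.someurl.org", 8080, user="myname", password="mypasswd")
# An opener that would route HTTP requests through the proxy; nothing is fetched here.
opener = build_opener(ProxyHandler({"http": url}))
print(url)
```

Leaving out user and password yields a plain host:port URL, which matches the unauthenticated case in the configuration above.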
Optionally, bgdata can be set not to look for newer versions of the packages in the remote repository and to use only what is available in the local one. To enable this option, add:
# If you want to force bgdata to work only locally
offline = True
Using the cache_repositories option, you can indicate a list of repositories (similar to the local one) in which to look for the files.
# Cache repositories
[cache_repositories]
# Pairs of name and path
my hard drive = /mnt/user/hd
Note
Cache repositories have higher priority than the local repository, meaning that bgdata looks in them before checking the local one. In addition, they are searched from last to first.
As an example of usage, data packages that are used frequently in our cluster are saved in the scratch directory of each node. This way, bgdata takes the data from the scratch directory, which is faster than using the network file system.
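The lookup order for cache repositories can be sketched as follows. This is an illustrative model rather than bgdata's actual code: lookup_order and find_package are hypothetical helpers, and the scratch path in the usage lines is made up.

```python
import os


def lookup_order(cache_repositories, local_repository):
    """Return folders in the order they would be checked (hypothetical helper).

    cache_repositories maps names to paths in the order they appear in the
    configuration file; they are visited last to first, then the local folder.
    """
    caches = list(cache_repositories.values())
    return list(reversed(caches)) + [local_repository]


def find_package(relative_path, cache_repositories, local_repository):
    """Return the first existing copy of relative_path, or None if absent."""
    for folder in lookup_order(cache_repositories, local_repository):
        candidate = os.path.join(os.path.expanduser(folder), relative_path)
        if os.path.exists(candidate):
            return candidate
    # Not found locally: with offline = True the search would stop here,
    # otherwise the remote repository would be the next place to look.
    return None


caches = {"my hard drive": "/mnt/user/hd", "scratch": "/scratch/bgdata"}
print(lookup_order(caches, "~/.bgdata"))
```

Because the entry listed last is checked first, the fastest storage (here the per-node scratch folder) would go at the bottom of the cache_repositories section.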