Chapter 2: Prepare the Environment for Chemoinformatics

In this chapter we set up the environment used throughout this book.

About Anaconda

Anaconda is a Python distribution that makes it easy to create and manage environments for machine learning. It also makes it easy to install packages such as RDKit, which is described later.

Q&A

Why use Anaconda?

Python ships with a relatively large standard library, but the libraries for chemoinformatics must be installed separately. This is not difficult once you are used to it, but it can be troublesome for beginners. Anaconda reduces this effort.

There are two major versions of Python: 2.x and 3.x.

Support for 2.x ended in January 2020, so new learners have no need to use 2.x.
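If you are unsure which major version an interpreter belongs to, a quick check such as the following sketch can help. The helper name `is_python3` is made up for this illustration; only the standard `sys` module is used.

```python
import sys


def is_python3(version_info=sys.version_info):
    """Return True when the given version tuple belongs to Python 3.x."""
    return version_info[0] == 3


# Report the version of the interpreter that runs this script.
print("Running Python %d.%d" % (sys.version_info[0], sys.version_info[1]))
print("Python 3:", is_python3())
```

Run it with the interpreter you intend to use for the rest of the book.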

How to install Anaconda

Now let's install Anaconda. Visit the official site and download the Python 3 installer for your environment. On Linux or Mac you can choose between a GUI and a command-line (CUI) installer; download the 64-bit command-line installer for Python 3.7.

$ bash ~/Downloads/Anaconda3-4.1.0-Linux-x86_64.sh # Replace the installer name with the file you downloaded

Press Enter when prompted.

Welcome to Anaconda3 2018.12

In order to continue the installation process, please review the license
agreement.
Please, press ENTER to continue
>>>

Keep pressing Enter to scroll through the license, then type yes at the [yes|no] prompt.

Do you accept the license terms? [yes|no]
[no] >>>

You are asked where to install Anaconda. The default location is usually fine, so press Enter.

Anaconda3 will now be installed into this location:
/Users/kzfm/anaconda3

  - Press ENTER to confirm the location
  - Press CTRL-C to abort the installation
  - Or specify a different location below

After installation you are also asked whether to install VSCode. Answer no.

Thank you for installing Anaconda3!

===========================================================================

Anaconda is partnered with Microsoft! Microsoft VSCode is a streamlined
code editor with support for development operations like debugging, task
running and version control.

To install Visual Studio Code, you will need:
  - Internet connectivity

Visual Studio Code License: https://code.visualstudio.com/license

Do you wish to proceed with the installation of Microsoft VSCode? [yes|no]
>>> Please answer 'yes' or 'no':
>>>

Once the Anaconda installation is complete, you will be able to use the 'conda' command from a command prompt or terminal.
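To confirm that the command is actually reachable, you can look it up on the PATH. The following is a minimal sketch using the standard library's `shutil.which`; the helper name `find_conda` is hypothetical, and the hint about opening a new terminal is an assumption about how the installer updates your shell settings.

```python
import shutil


def find_conda(path=None):
    """Return the full path of the conda executable, or None if it is not found.

    `path` is an optional search path; by default the PATH variable is used.
    """
    return shutil.which("conda", path=path)


location = find_conda()
if location is None:
    # Assumption: the installer edits the shell startup file, so a new
    # terminal may be needed before the command becomes visible.
    print("conda was not found on PATH; try opening a new terminal.")
else:
    print("conda found at", location)
```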

Build a Virtual Environment and Install a Package

The Python installed with Anaconda is 3.7, but the latest RDKit distributed at the time of writing requires Python 3.6. So create a virtual environment with conda and install the required version of Python in it. The name given after -n is "py4chemoinformatics" here, but you can use any name you like. After creating the virtual environment, install the packages used in this chapter and the following ones.

$ conda create -n py4chemoinformatics python=3.6
$ source activate py4chemoinformatics # Mac/Linux
$ activate py4chemoinformatics # Windows

# install packages
$ conda install -c conda-forge rdkit
$ conda install -c conda-forge seaborn
$ conda install -c conda-forge ggplot
$ conda install -c conda-forge git
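Once the installation finishes, you may want to check that the Python packages can be found from the activated environment. The sketch below uses `importlib.util.find_spec` from the standard library. The helper name `missing_modules` is made up for this example, and the import names `rdkit`, `seaborn`, and `ggplot` are assumed to match the package names. Git is a command-line tool, so it is not checked here.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that the import system cannot find."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Assumed import names for the packages installed with conda above.
required = ["rdkit", "seaborn", "ggplot"]
missing = missing_modules(required)
if missing:
    print("Not found in this environment:", ", ".join(missing))
else:
    print("All required packages were found.")
```

An empty result only means the modules can be located; it does not guarantee that every package imports without errors.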

Description of the installed packages

RDKit

RDKit is one of the most commonly used toolkits in the field of chemoinformatics. It is open source software (OSS) and can be used free of charge. For more information, please refer to the Introduction.

seaborn

seaborn is a package for visualizing statistical data.

ggplot

ggplot is a plotting package whose strength is that graphs are built in a systematic way from a consistent grammar. Originally developed for the statistical analysis language R, it was ported to Python by the company yhat.

Git

Git is a version control system. This book does not explain Git, but if you are completely new to it, take a look at Git Primer, which can be understood by monkeys.

As explained in the Introduction, the following command downloads all the data for this book, including the PDF. Run it as needed.

$ git clone https://github.com/Mishima-syk/py4chemoinformatics.git

Learn more about Conda

Why create a virtual environment

Some systems use Python internally to provide various features, so changing the Python version for the sake of a particular package can cause problems. Virtual environments solve this. Even when packages require different library versions, you can set up a separate virtual Python environment for each and experiment freely. When an environment is no longer needed, it can be deleted without affecting the original environment. Because you can create separate development environments on one system, you avoid the library dependency problems and Python version differences that often arise during development.
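To see which environment a script is actually running in, you can inspect the interpreter itself. This is a minimal sketch: the helper name `describe_environment` is hypothetical, and it relies on the `CONDA_DEFAULT_ENV` variable that conda sets when an environment is activated, so the value is empty if nothing has been activated.

```python
import os
import sys


def describe_environment(environ=os.environ):
    """Return the active conda environment name (if any) and the interpreter prefix."""
    return {
        # Set by conda on activation; None when no environment is active.
        "conda_env": environ.get("CONDA_DEFAULT_ENV"),
        # Directory in which the running Python interpreter is installed.
        "prefix": sys.prefix,
    }


for key, value in describe_environment().items():
    print(key, "=", value)
```

Running the script before and after activating an environment should show a different prefix each time, which is the isolation described above.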

This book uses only one virtual environment, but in practice it is common to create many virtual environments during development. The conda subcommands used most frequently are listed below.

$ conda install <package-name> # Install a package
$ conda create -n <environment-name> python=<version> # Create a virtual environment
$ conda info -e # List the virtual environments that have been created
$ conda remove -n <environment-name> --all # Delete a virtual environment
$ source activate <environment-name> # Enter a virtual environment (Mac/Linux)
$ activate <environment-name> # Enter a virtual environment (Windows)
$ source deactivate # Leave the virtual environment
$ conda list # List the libraries installed in the current virtual environment