Vivek Raghuram edited this page Dec 1, 2016 · 33 revisions

contact: [email protected]


Depending on your interests, you'll need to download/clone different repositories of the general ECG system. This page attempts to describe the main interests, as well as the required repositories and dependencies for each.


System Requirements

These requirements apply for every part below except Grammar Design. The system has been tested on OSX and Linux environments.

The following packages are required:

You will also need the following repositories from GitHub:

  • ECG Framework
  • ecg-grammars
    • Note: the analyzer.sh script assumes this is installed in the same directory as ecg_framework_code

Note: Once you clone the repository, you'll also want to set your PYTHONPATH in your .bash_profile to point to:

{INSTALL_PATH}/ecg_framework_code/src/main
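For example, you could append lines like the following to your ~/.bash_profile (the `$HOME/ecg` value here is a hypothetical install location; substitute wherever you actually cloned the repository):

```shell
# Hypothetical install location -- replace with your actual clone directory.
INSTALL_PATH="$HOME/ecg"
export PYTHONPATH="$INSTALL_PATH/ecg_framework_code/src/main:$PYTHONPATH"
```

After editing the file, run `source ~/.bash_profile` (or open a new terminal) for the change to take effect.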

Grammar Design and Viewing Semantic Specifications

If your primary interest is in viewing and modifying Embodied Construction Grammar (ECG), you may not need to download the rest of the "full-path" system. This is the functionality of earlier ECG releases. Instead, you'll need to clone the following repositories:

  1. ECG Grammars
  2. ECG Workbench Release

ECG Grammars

The first is a repository of hand-built ECG grammars. These grammars are necessary for the ECG Analyzer to produce a Semantic Specification (SemSpec) of an input utterance. To clone this repository, navigate to the directory of your choice and enter the following command:

git clone https://github.com/icsi-berkeley/ecg_grammars.git

This will create a new folder called ecg_grammars on your machine, located in the directory in which you entered the command. If there are updates to the origin repository, you can retrieve them with the following commands:

cd ecg_grammars
git pull

ECG Workbench

The ECG Workbench is a tool for editing ECG grammars and visualizing SemSpecs (more info here). To clone this repository, navigate to the directory of your choice (ideally the same directory in which you cloned the ecg_grammars repository) and enter this command:

git clone https://github.com/icsi-berkeley/ecg_workbench_release.git

This will create a new folder called ecg_workbench_release. It's a large repository, so it may take longer to clone than the ecg_grammars repository. Once it's finished, you'll have access to the ECG Workbench. By default, this repository comes with three builds of the workbench, one for each platform:

  1. Linux
  2. Mac OS X
  3. Windows

In the ecg_workbench_release folder, open up the workbench directory, and then navigate into the folder corresponding to your machine. Open up the application.

For more information about using the ECG Workbench and viewing SemSpecs, check out additional documentation.

Viewing ActSpecs from Existing Grammars

If your primary interest is in viewing the n-tuple data structure, which is the foundational communication paradigm for our natural language understanding system, then you will need at least:

  1. ECG Grammars
  2. ECG Framework

For the first, see above for information on cloning the ECG Grammars repository.

ECG Framework

The ECG Framework repository contains code for the core modules of our NLU system. It requires the ECG Grammars repository to run, as well as other dependencies. Once you've cloned ecg_grammars, clone the ecg_framework_code repository in the same directory:

git clone https://github.com/icsi-berkeley/ecg_framework_code.git
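Since scripts such as analyzer.sh assume the two repositories are siblings, a quick sanity check can save debugging later. This is only a sketch, assuming a hypothetical `$HOME/ecg` parent directory:

```shell
# Check that ecg_grammars and ecg_framework_code sit side by side.
# $HOME/ecg is a hypothetical parent directory; substitute your own.
parent="$HOME/ecg"
for repo in ecg_grammars ecg_framework_code; do
  if [ -d "$parent/$repo" ]; then
    echo "$repo: found"
  else
    echo "$repo: missing -- clone it into $parent"
  fi
done
```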

See here for information on running the system; this section explicitly covers the scripts required to view JSON n-tuples.

As mentioned above, you'll want to point your PYTHONPATH (in your .bash_profile) to the following:

export PYTHONPATH={PATH_TO_FRAMEWORK}/src/main:$PYTHONPATH

For a general wiki on the framework, see here.

Using a Text-Based Robot Demo

We have created a text-based demo that allows a user to type commands and questions into a Terminal prompt, and receive either answers or printed-out information about the robot's movements, e.g.:

> Robot1, move to the big red box!
FED1_ProblemSolver: robot1_instance is moving to (6.0, 6.0, 0.0)

If you're interested in using this, you'll need at least the following three repositories:

  1. ECG Grammars
  2. ECG Framework
  3. ECG Robot Code

The instructions here provide an overview for installing, setting up, and running the demo.

As with ecg_framework_code, you'll want to manually add this repository to your PYTHONPATH:

export PYTHONPATH={PATH_TO_ROBOT_CODE}/src/main:$PYTHONPATH

NOTE: If you don't want to install each repository separately, we suggest you use the ecg_interface repository, which contains the others as submodules. Follow the instructions in the README for installation.

Using a Simulated Robot Demo

We have also used this NLU system for a simulated robot demo (video). The bulk of this work was done using the Morse robotics simulator, but we have also implemented a version in ROS.

For both simulations, you'll need the following repositories:

  1. ECG Grammars
  2. ECG Framework
  3. ECG Robot Code

Morse

If you're interested in running the system with the Morse simulator, you'll want to follow the installation instructions here.

ROS

For information on how to install and run our ROS demo, check out these instructions. Note that these instructions include the installation of ecg_interface, which includes the ecg_framework_code, ecg_grammars, and ecg_robot_code directories (but does not include ecg_workbench_release).

Integrating to a New Product

Ultimately, it's possible that you are interested in adapting the NLU system to an entirely new application. Part of the fundamental motivation for this research is to facilitate relatively simple re-targeting of a language understanding system to new domains. We have done this between Morse and ROS (see above), which are both in the "robotics" domain, and are working on an implementation for Starcraft, which is both a new domain and a new application.

Click here for a tutorial on retargeting this system to a new application.

Installing and Running Speech Recognition

We've integrated the Kaldi open source speech recognition toolkit to allow you to speak commands to an autonomous system. It's in the preliminary stages, but we have some files available for experimentation with the Robots demo.

To begin, you will have to download the speech recognition models. In the instructions below, replace the single instance of /path/to/directory with the path where you want to store the models. The models take about 600 MB.

asrdir=/path/to/directory
mkdir -p "$asrdir"
cd "$asrdir"
wget https://github.com/icsi-berkeley/ecg_asr/releases/download/alpha1/asr_models_20160627.tar.gz -O - | tar xvzf -
cp online_nnet2_decoding.conf online_nnet2_decoding.conf.orig
sed "s:/t/janin/ecg/asr/:$(pwd)/:" < online_nnet2_decoding.conf.orig > online_nnet2_decoding.conf

You might need to set $asrdir in your .bash_profile.
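The sed step above exists because the packaged config hard-codes /t/janin/ecg/asr/, the path where the models were originally built; the command rewrites that prefix to your local model directory. Here is a self-contained illustration of the substitution using a made-up config line (the real file is online_nnet2_decoding.conf; the option name below is purely hypothetical):

```shell
# Simulate the path rewrite on a made-up config line.
echo "--some-model-option=/t/janin/ecg/asr/conf/example.conf" > demo.conf.orig
sed "s:/t/janin/ecg/asr/:$(pwd)/:" < demo.conf.orig > demo.conf
result=$(cat demo.conf)
rm demo.conf demo.conf.orig
echo "$result"   # the /t/janin/... prefix is now the current directory
```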

Next, you'll need to install Kaldi. They've packaged it up to be fairly easy to install for an experienced unix-ish programmer, but it's quite big. You only need a single program, called online2-wav-nnet2-latgen-faster, so, for your convenience, we've generated precompiled binaries for a few platforms. Note that these will likely become out of date in the future, and we do not intend to provide binaries for additional platforms. See http://kaldi-asr.org for how to install Kaldi yourself.

As of June 2016, we have three versions available: OSX_ELCAPITAN is for Macintosh OS X 10.11 "El Capitan"; LINUX_GLIBC_2.23 is for reasonably recent versions of Linux; LINUX_GLIBC_2.12 is for older versions of Linux.

In the commands below, replace OSX_ELCAPITAN with the version you want.

wget https://github.com/icsi-berkeley/ecg_asr/releases/download/alpha1/online2-wav-nnet2-latgen-faster_OSX_ELCAPITAN -O online2-wav-nnet2-latgen-faster
chmod uog+x online2-wav-nnet2-latgen-faster
export PATH=$asrdir:$PATH
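To double-check the setup, you can confirm the directory landed on your PATH and see whether the shell now resolves the binary (command -v prints nothing and fails if the program is not found). The default value for asrdir below is only a placeholder; it should already be set from the model-download step:

```shell
# Verify the decoder binary is reachable via PATH.
asrdir="${asrdir:-$HOME/asr_models}"   # placeholder default for illustration
export PATH="$asrdir:$PATH"
command -v online2-wav-nnet2-latgen-faster || echo "binary not found on PATH"
```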

Finally, you need to have the program sox installed. It's an open source audio processing package available at http://sox.sourceforge.net. As usual for open source software, it's easiest to install it with a package manager (e.g. brew install sox or apt-get install sox).
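A quick way to confirm sox is installed and visible on your PATH, regardless of which package manager installed it:

```shell
# Record whether sox is available on this machine.
if command -v sox >/dev/null 2>&1; then
  sox_status="installed ($(sox --version 2>&1))"
else
  sox_status="missing -- install it with your package manager"
fi
echo "$sox_status"
```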

At this point, you should be able to run speechagent.py, which comes packaged in the ecg_framework_code repository.

speechagent.py -asr $asrdir