Test & Merge Dev Branch #36

Open
wants to merge 59 commits into
base: master
57ba914
Upgrade file target to 5.35
Feb 6, 2019
29e4291
Initialize all vars in JQ target
Feb 6, 2019
6aadb0c
File 5.35 bugfixes- works end to end
Feb 6, 2019
978ca89
Update jq config to run differently, fix typo in example input
Feb 6, 2019
5dbecb8
Cleanup libyaml config, different cli flags
Feb 6, 2019
afdaf49
Add bzip2 target
Feb 6, 2019
5d4073f
Add filemagic config and inputs
Feb 6, 2019
50c4123
Add new input for file
Feb 6, 2019
625123b
Yara fixups script
Feb 6, 2019
0b69a0f
Add tinyexpr target (WIP)
Feb 6, 2019
1ccf6c7
Code cleanup, updates to afl, file and toy targets
Apr 1, 2019
96c287f
Cleanup competition script
Apr 8, 2019
af35b63
Add blog link to docs as suggested in #24
Apr 8, 2019
35c91ad
Cleanup hack in competition script
Apr 8, 2019
17deb19
error in docs
tleek Apr 12, 2019
d01f047
Merge branch 'newtargets' of github.com:panda-re/lava into newtargets
tleek Apr 12, 2019
c6474ff
Bugfix in afl config
Apr 12, 2019
8ceecd9
Merge branch 'newtargets' of github.com:panda-re/lava into newtargets
Apr 12, 2019
c31843b
Add sqlite target (WIP)
Apr 28, 2019
1a8b2c3
Add sqlite target (WIP)
Apr 28, 2019
baebf8a
Testing new approach to query-insertion for sqlite. Worked earlier bu…
Apr 28, 2019
bb60157
Don't inject into builtin args (to fix rvalue issues with attack_poin…
Apr 30, 2019
4a8a387
Bugfix in libjpeg config, pass build dir to inj fixup script
Apr 30, 2019
ef04101
Add interactive exceptions to lava.py
May 1, 2019
56e2be5
Add new validate step (WIP) to do some santiy checking on the target …
May 1, 2019
1436e89
Update sqlite config to have valid input programs
May 1, 2019
885aebc
Vars.sh: expected exit code is 0 if unspecified
May 1, 2019
92619ca
competition use fast bug selection
May 1, 2019
98726a6
Update lavaTool to get containing function name better
May 1, 2019
5d1dcbd
Bugfix in interactive mode
May 31, 2019
b343013
Verbose LAVALOG comments
Jun 3, 2019
1927d8b
Update wheezy-backports mirror
Apr 29, 2019
bee4ae0
Fix competition build scripts to be more reproducable
Jul 12, 2019
a09f311
Initial covbug implementation
Jul 12, 2019
8cdfa8a
Update coverage tools to inject coverage bugs. Add prebuilt sw-btrace64
Jul 17, 2019
5b7a557
Added validate step and new tests to validate all projects
Sep 3, 2019
690cda2
Merge branch 'newtargets' of github.com:panda-re/lava into newtargets
Sep 3, 2019
7ed066b
Add missing validate script
Sep 3, 2019
6c58a4e
New error messages for starting injection when you've skipped steps
Sep 3, 2019
5a5bbfd
Add more error checks
Sep 4, 2019
3b36b88
Minor cleanup to lava.sh variables and how we resett taint in db
Sep 5, 2019
5a4b416
FBI: Print percent complete to bug-mining log
Sep 5, 2019
db6fc35
Reset taint DB faster
Sep 5, 2019
f6a9c66
Minor updates to FBI
Sep 5, 2019
ac39e64
Minor bugfix
Sep 5, 2019
f90aa1c
Bugfix in validte
Sep 5, 2019
806e022
Bugfix in tinyexpr makefile
Sep 5, 2019
0cfacd5
Competition: automatically remove bad bugs
Sep 5, 2019
fab51c0
Disable dataflow for a few targets
Sep 5, 2019
6fd631a
Remove interactive exceptions so we can just catch them and retry
Sep 5, 2019
29633b0
Minor cleanup to bug mining
Sep 5, 2019
4aa0161
Better string replacements from config file
Sep 5, 2019
b77abaf
Bugfix in competition
Sep 5, 2019
fff6cba
Unquote booleans in config files. Turn dataflow off for some targets
Sep 6, 2019
7384e16
Update replace macros and bad_bin_search. Fixup file-5.35 makefile
Oct 3, 2019
fee46c0
Install odb dependencies before building fbi
Oct 4, 2019
c17d700
Add missing dependency, auto run init_host.py from setup and update docs
Oct 4, 2019
093ea44
missing deps
tleek Oct 4, 2019
509a648
build script
tleek Oct 4, 2019
2 changes: 0 additions & 2 deletions .gitignore
@@ -36,7 +36,5 @@ CTestTestfile.cmake
/scripts/getfns.pickle


/target_bins/*
/target_injections/*
!/target_bins/.gitkeep
!/target_injections/.gitkeep
6 changes: 4 additions & 2 deletions README.md
@@ -23,7 +23,7 @@ changes to your system. Once it finishes, you should have
[PANDA](https://github.com/panda-re/panda) installed into
`panda/build/` (PANDA is used to perform dynamic taint analysis).

-Next, run `init-host.py` to generate a `host.json`.
+Setup will automatically run `init_host.py` to generate a `host.json`.
This file is used by LAVA to store settings specific
to your machine. You can edit these settings as necessary, but the default
values should work.
@@ -47,7 +47,9 @@ If you want to inject bugs into a new target, you will likely need to make some
modifications. Check out [How-to-Lava](docs/how-to-lava.md) for guidance.

# Documentation
-Check out the [docs](docs/) folder to get started.
+Check out the [docs](docs/) folder to get started. You may also be interested
+in [this blog post](http://moyix.blogspot.com/2016/06/how-to-add-a-million-bugs-to-a-program.html)
+for a high-level overview of LAVA.


# Current Status
49 changes: 0 additions & 49 deletions SETUP.md

This file was deleted.

2 changes: 2 additions & 0 deletions build.sh
@@ -0,0 +1,2 @@
pip install colorama
./setup.py
4 changes: 4 additions & 0 deletions covbugs/.gitignore
@@ -0,0 +1,4 @@
scratch
file
libjpeg
*.info
22 changes: 22 additions & 0 deletions covbugs/README.md
@@ -0,0 +1,22 @@
Guide to adding coverage bugs

# Motivation
LAVA can only add bugs to paths we know how to explore.
By adding simple bugs to functions we can't reach, we can try to get competitors to generate inputs that improve our coverage.

# Process
## Build target
Build a bug-free, non-preprocessed version of the target binary. Modify the makefile to also log coverage information by adding `--coverage -fprofile-arcs -ftest-coverage` to CFLAGS.
Also (or instead), build in docker with `sw-btrace` and then run `sw-btrace-to-compiledb` to generate a `compile_commands.json`.

## Collect all coverage
Edit `PROG` and `PROG_DIR` in `cov.sh`, then run it to measure total coverage across all submitted inputs for all versions of the target program. This will probably take a few hours.
Example usage: `cov.sh file/file-5.35/ file/inputs/*`

## Parse coverage data
Run `parse_cov.py` to transform lcov's output into a Python pickle, `uncovered.pickle`, which identifies all uncovered functions and their lines.
In docker, run `add_covbugs.py` to generate YAML for all bugs.
In docker, in the target source directory, run `clang-apply-replacements .` to update the source.
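
As a rough illustration of what the pickle holds, the sketch below lists the functions that were never executed. It assumes the layout written by `parse_cov.py` in this PR (`funcs` mapping each function name to `start`, `end`, `uncovlines` and `execs`); the helper name `uncovered_functions` and the sample data are hypothetical.

```python
def uncovered_functions(results):
    """Yield (filename, function_name, uncovered_lines) for functions with zero executions."""
    for filename, data in sorted(results.items()):
        for name, info in sorted(data["funcs"].items()):
            if info["execs"] == 0:  # never reached by any input
                yield filename, name, info["uncovlines"]


# Hypothetical stand-in for pickle.load(open("uncovered.pickle", "rb")).
sample = {
    "src/magic.c": {
        "uncovered_lines": [10, 11],
        "funcs": {
            "magic_open": {"start": 1, "end": 8, "uncovlines": [], "execs": 3},
            "magic_list": {"start": 9, "end": None, "uncovlines": [10, 11], "execs": 0},
        },
    }
}

for row in uncovered_functions(sample):
    print(row)
```

With a real run, replace `sample` with the unpickled `uncovered.pickle`.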

## Produce buggy target
Make the target (without coverage flags) and fuzz it for a bit to check that we find some of the bugs.
43 changes: 43 additions & 0 deletions covbugs/add_covbugs.py
@@ -0,0 +1,43 @@
#!/usr/bin/env python

import pickle
import os
import sys
from subprocess import check_output

# For each uncovered function, try to add a single covbug into it with lavaTool

assert(len(sys.argv) == 3), "USAGE: {} [LavaBase] [SrcRoot]".format(sys.argv[0])
lavaBase = sys.argv[1]
srcRoot = sys.argv[2]

results = pickle.load(open("uncovered.pickle","rb"))
# {filename: {
#     "uncovered_lines": [1, 2, 3, ...],
#     "funcs": {name: {"start": X, "end": Y, "uncovlines": [...], "execs": N}, ...}}}

# Generate yaml changes with lavaTool
for f, details in results.items():
    f = os.path.join(srcRoot, f)
    assert os.path.isfile(f), "Couldn't find file {}".format(f)

    if not f.endswith(".c"):
        continue

    # lavaTool filename.c --covbug func_name:l1,l2,l3
    # should inject a covbug before any returns in l1, l2, or l3
    covbugs = []
    #print(details)
    for fn, func_details in details["funcs"].items():
        if func_details['execs'] == 0: # Uncovered
            newcmd = "{}:[{}]".format(fn, ",".join([str(x) for x in func_details['uncovlines']]))
            covbugs.append(newcmd)

    if len(covbugs):
        cmd = os.path.join(lavaBase, "tools/install/bin/lavaCovBugsLoc") + " {} --funcs={}".format(f, ",".join(covbugs))
        try:
            check_output(cmd, shell=True)
        except Exception as e:
            print(e)

# Actually apply changes
42 changes: 42 additions & 0 deletions covbugs/cov.sh
@@ -0,0 +1,42 @@
#!/bin/bash
#set -x
set -e

# Usage ./cov [root_dir] [inputs]
# Configure PROG and PROG_DIR below

DIR=$1
pushd $DIR

shift

#echo $@

#make clean | true
CFLAGS=--coverage make install
PROG_DIR="sqlite/src/src"
PROG="sqlite"

popd

rm -f result.info
rm -rf scratch | true
mkdir scratch

doit=false

for input in $(ls $@); do
    echo $input
    safename=$(basename $input)

    ${PROG_DIR}/${PROG} < ${input} | true # Non-zero exits are allowed
    geninfo ${PROG_DIR} -o scratch/cov_$safename.info

    if [ -e result.info ]; then # If exists, append
        lcov --add-tracefile scratch/cov_$safename.info -t test_$safename -a result.info -t old -o result.info
    else # Else just copy
        lcov --add-tracefile scratch/cov_$safename.info -t test_$safename -o result.info
    fi
done

rm -rf scratch
81 changes: 81 additions & 0 deletions covbugs/parse_cov.py
@@ -0,0 +1,81 @@
#!/usr/bin/env python2

# Parse an lcov file to make a pickle mapping filenames
# to a list of entirely uncovered functions and then a list
# of lines in each of the functions

# Run anywhere (host/docker)

import sys
import pickle

assert(len(sys.argv) == 3), "USAGE: {} gcov_result base_path".format(sys.argv[0])
with open(sys.argv[1]) as infile:
    lines = infile.readlines()

startpath = sys.argv[2]

results = {} # {filename:
             #    uncovered_lines: [1,2,3,...]
             #    funcs: {fun1: {start: X, end: y, uncovlines: [], execs: 0}, ...}

curfile = None
curfunc = None

for line in lines:
    line = line.strip()
    if line.startswith("SF:"):
        if curfile: print(results[curfile]) # Print at the end of each
        curfile = line.split("SF:")[1].replace(startpath, "") # Trim to just filename
        results[curfile] = {"funcs": {}, "uncovered_lines": []}

    elif line.startswith("FN:"):
        (first_loc, func_name) = line.split("FN:")[1].split(",")
        first_loc=int(first_loc)

        for prior_func in results[curfile]["funcs"].keys():
            if results[curfile]["funcs"][prior_func]["end"] == None:
                results[curfile]["funcs"][prior_func]["end"] = first_loc-1 # End the prior func

        results[curfile]["funcs"][func_name] = {"start": first_loc, "end": None,
                                                "uncovlines": [], "execs": False}

    elif line.startswith("FNDA:"):
        (count, func_name) = line.split("FNDA:")[1].split(",")
        count=int(count)
        if func_name not in results[curfile]["funcs"]:
            continue
        #assert(curfile in results.keys()), "Missing file {}".format(curfile)
        #assert(func_name in results[curfile]["funcs"]), "Missing funname {} in {}".format(func_name, curfile)
        results[curfile]["funcs"][func_name]["execs"] = count

    elif line.startswith("DA:"):
        (loc, count) = [int(x) for x in line.split("DA:")[1].split(",")]

        # Map this source line back to its containing func
        curfunc = None
        if not len(results[curfile]["funcs"].items()):
            print("Warning skipping file {} because it has no functions".format(curfile))
            continue

        for func, func_data in results[curfile]["funcs"].items():
            if func_data["start"] <= loc and (func_data["end"] is None or func_data["end"] > loc):
                curfunc = func
                break

        if count == 0: # uncov lines is just which lines were uncovered
            if not curfunc: continue
            assert(curfunc), "Uncovered line but no idea what function: {}".format(line)
            results[curfile]["uncovered_lines"].append(loc)
            results[curfile]["funcs"][curfunc]["uncovlines"].append(loc)



for f, data in results.items():
    #print("\n {}".format(f))
    for func_name, func_data in data["funcs"].items():
        if func_data["execs"] == 0:
            #print(f, func_name, func_data["uncovlines"])
            print(f, func_name, func_data["uncovlines"])

pickle.dump(results, open("uncovered.pickle","wb"))
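
Review note: for readers unfamiliar with the lcov tracefile records this script consumes (`SF:`, `FN:`, `FNDA:`, `DA:`, `end_of_record`), here is a minimal, self-contained sketch that extracts only per-function execution counts. It is not the script above: `parse_tracefile` and the sample text are hypothetical, and it assumes the `FN:<line>,<name>` record form, which may differ in newer lcov releases.

```python
def parse_tracefile(text):
    """Return {source_file: {function_name: execution_count}} from lcov tracefile text."""
    counts = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("SF:"):
            # Start of a per-source-file section.
            current = counts.setdefault(line[3:], {})
        elif line.startswith("FN:") and current is not None:
            # Function declaration: FN:<first line>,<name>
            _, name = line[3:].split(",", 1)
            current.setdefault(name, 0)
        elif line.startswith("FNDA:") and current is not None:
            # Function execution count: FNDA:<count>,<name>
            count, name = line[5:].split(",", 1)
            current[name] = current.get(name, 0) + int(count)
        elif line == "end_of_record":
            current = None
    return counts


SAMPLE = """\
TN:test_a
SF:/src/file.c
FN:10,used
FN:20,unused
FNDA:4,used
FNDA:0,unused
DA:10,4
DA:20,0
end_of_record
"""

counts = parse_tracefile(SAMPLE)
uncovered = sorted(name for name, n in counts["/src/file.c"].items() if n == 0)
print(uncovered)
```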
5 changes: 4 additions & 1 deletion docker/Dockerfile
@@ -1,5 +1,5 @@
FROM i386/debian:stretch
-RUN echo deb http://httpredir.debian.org/debian wheezy-backports main >> /etc/apt/sources.list
+RUN echo deb http://archive.debian.org/debian wheezy-backports main >> /etc/apt/sources.list
RUN apt-get update
RUN apt-get install -y sudo build-essential python wget cmake gdb gawk mlocate \
vim libc++-dev g++-multilib g++ ninja-build \
@@ -76,3 +76,6 @@ RUN update-locale LANG=C.UTF-8

# Having autoconf in the container will make building autotools packages easier
RUN apt-get install -y autoconf libtool m4 automake

+# sqlite3 needs tcl
+RUN apt-get install -y tcl
80 changes: 80 additions & 0 deletions docs/creating-a-target
@@ -0,0 +1,80 @@
This is supplemental to docs/how-to-lava.

Goal: Given a C project with some makefile-based setup, you want to add LAVA bugs.

There are two major steps here:
1) Build minimal, preprocessing makefiles
2) Run LAVA

To begin, you want to replace the makefile with something like:

```
all: libyaml

CFLAGS += -O0 -m32 -DHAVE_CONFIG_H -I. -I.. -I../include -g -gdwarf-2

LIBOBJ = \
	api-pre.o \
	reader-pre.o \
	scanner-pre.o \
	parser-pre.o \
	loader-pre.o \
	writer-pre.o \
	emitter-pre.o \
	dumper-pre.o

.SECONDARY:
%-pre.c :
	gcc -include stdio.h $(CFLAGS) -E -o $@ $(shell echo "$@" | sed -e "s/-pre//")
	sed -i '/^#/ d' $@

%.o : %.c
	$(CC) $(CFLAGS) -c -o $@ $<

libyaml : $(LIBOBJ) deconstruct-pre.c
	$(CC) $(CFLAGS) -o libyaml $^

preclean :
	rm -f *-pre.c
	rm -f *-pre.h

clean :
	rm -f libyaml
	rm -f *.o
	rm -f *.so
	rm -f *.Tpo
```

To do this, you're going to need to identify the CFLAGS and the code that's being compiled.

Get your project ready to be made (e.g., run ./compile)

Then, in the docker container run:
`~/path-to-lava/tools/btrace/sw-btrace make`, and
`~/path-to-lava/tools/btrace/sw-btrace-to-compiledb /llvm-3.6.2`

When these are finished, you should have a `compile_commands.json` file describing every compilation that was run during make.

Use vim macros, grep, or your favorite text parsing tools to reduce this to the raw list of GCC commands executed in each directory.

For each directory, extract the gcc arguments (these will be your CFLAGS).

Now edit the above makefile. Replace libyaml with your target and update the CFLAGS to what you found.
Update the LIBOBJ list to contain each compiled filename you observed in your compile_commands.json, changing `foo.c` into `foo-pre.o`.
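
If you would rather script this step than edit by hand, something like the following sketch may help. It assumes compile_commands.json follows the usual Clang compilation database layout (a list of entries with `directory` and `file` keys); the helper name `libobj_by_directory` and the sample entries are made up for illustration.

```python
import os
from collections import OrderedDict


def libobj_by_directory(compile_db):
    """Group compiled .c files by build directory, renaming foo.c to foo-pre.o."""
    groups = OrderedDict()
    for entry in compile_db:
        src = entry["file"]
        if not src.endswith(".c"):
            continue  # only C sources end up in LIBOBJ
        stem = os.path.splitext(os.path.basename(src))[0]
        objs = groups.setdefault(entry["directory"], [])
        obj = stem + "-pre.o"
        if obj not in objs:  # a file may be compiled more than once
            objs.append(obj)
    return groups


# Hypothetical entries; in practice use json.load(open("compile_commands.json")).
sample_db = [
    {"directory": "/build/src", "command": "gcc -c api.c -o api.o", "file": "api.c"},
    {"directory": "/build/src", "command": "gcc -c reader.c -o reader.o", "file": "reader.c"},
    {"directory": "/build/tests", "command": "gcc -c run-parser.c -o run-parser.o", "file": "run-parser.c"},
]

for directory, objs in libobj_by_directory(sample_db).items():
    print(directory)
    print("LIBOBJ = " + " ".join(objs))
```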

Replace the libyaml target with the command to make the binary you're interested in.

Now in the docker container, run `make` and test if it works. If things are good, run `make clean` (which will leave behind the preprocessed source). Build a tarball and make a config similar to one in `target_configs/` for your target.

Remove any -MT *.lo, -MD, -MP, -MF *.Tpo nonsense as well as -fPIC, -dPIC.
Replace any -OX flags with -O0.
Modify output targets to be in the same directory (e.g., not in .libs).
Delete the -I/llvm-3.6.2.

Add `-gdwarf-2 -m32` because these are important

Now these should look like simple gcc commands, compiling a .c file into an .o file
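
The flag cleanup described above can be sketched roughly as follows. This is only an illustration under assumptions: `clean_cflags` is a hypothetical helper, the sample command is invented, and dropping the compiler name, `-c`, `-o <file>`, the source file and libtool's `-DPIC` spelling are extra choices made here so that the result can be pasted into CFLAGS.

```python
import shlex

# Flags that take a separate argument which must be dropped too.
FLAGS_WITH_ARG = {"-MT", "-MF", "-o"}
# Standalone flags to drop (-DPIC and -c are assumptions, see above).
DROP_FLAGS = {"-MD", "-MP", "-fPIC", "-dPIC", "-DPIC", "-c", "-I/llvm-3.6.2"}


def clean_cflags(command):
    """Reduce one recorded gcc command line to a CFLAGS-style list of flags."""
    tokens = shlex.split(command)[1:]  # drop the compiler name
    flags = []
    skip_next = False
    for tok in tokens:
        if skip_next:  # argument of -MT / -MF / -o
            skip_next = False
            continue
        if tok in FLAGS_WITH_ARG:
            skip_next = True
            continue
        if tok in DROP_FLAGS or tok.endswith(".c"):
            continue
        if tok.startswith("-O"):  # any optimisation level becomes -O0
            tok = "-O0"
        flags.append(tok)
    for required in ("-gdwarf-2", "-m32"):
        if required not in flags:
            flags.append(required)
    return flags


sample = ("gcc -DHAVE_CONFIG_H -I. -I../include -I/llvm-3.6.2 -g -O2 "
          "-MT api.lo -MD -MP -MF .deps/api.Tpo -c api.c -fPIC -DPIC -o .libs/api.o")
print(" ".join(clean_cflags(sample)))
```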

Make your LIBOBJs these filenames

Try building
1 change: 0 additions & 1 deletion docs/how-to-lava.md
@@ -1,4 +1,3 @@

# How to get a new target working with LAVA

# Prerequsites
File renamed without changes.