LODAnalysis

##Requirements

PIG[1]
Hadoop[2]
Python[3]
D2S4PIG[4]: clone the project, run `mvn clean package` to compile it, and add it to the PIG classpath

##Analysis

###pipeline:

1. Fetch datasets (wouter)
2. `hadoop fs -put` all datasets
3. exec `runAnalysis.sh -p <hadoop_path>`
4. `hadoop fs -get` (but merge) hadoop analysis
5. `java -jar ..`
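
The Hadoop-side steps of the pipeline could be scripted roughly as follows. This is only a sketch under several assumptions: the function names, the `local_output` and `hadoop_result_dir` parameters, and the way `runAnalysis.sh` is invoked are hypothetical, and `hadoop fs -getmerge` is used as one plausible reading of "get (but merge)". The dataset fetch and the final `java -jar ..` step are left out because the README does not specify them.

```python
import shlex
import subprocess


def build_pipeline_commands(local_datasets, hadoop_path, hadoop_result_dir, local_output):
    """Return the Hadoop-side pipeline steps as argv lists, in execution order.

    All parameter names are hypothetical; hadoop_result_dir must be supplied
    by the caller because the README does not state where results are written.
    """
    return [
        # Step 2: copy all local datasets to the Hadoop filesystem.
        ["hadoop", "fs", "-put", *local_datasets, hadoop_path],
        # Step 3: run the analysis script (invocation style is an assumption).
        ["./runAnalysis.sh", "-p", hadoop_path],
        # Step 4: fetch the results, merged into a single local file.
        ["hadoop", "fs", "-getmerge", hadoop_result_dir, local_output],
    ]


def run_pipeline(commands, dry_run=True):
    """Print each command; execute it as well when dry_run is False."""
    for argv in commands:
        print(shlex.join(argv))
        if not dry_run:
            # check=True stops the pipeline at the first failing step.
            subprocess.run(argv, check=True)
```

Calling `run_pipeline(build_pipeline_commands(...))` with the default `dry_run=True` only prints the commands, which makes it easy to check them before anything touches the cluster.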

###Main Analysis method
Run `runAnalysis.sh` to get more information on how to run all analysis methods.

###NameSpace extraction
To run, execute `pig LODAnalysis/pig/extractNs.py <hadoop_input_file>`. Output is stored in path `<hadoop_input_file>_analysis/namespaces`. For now, this script only counts the namespaces occurring in predicate position (we can easily change this).
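
To illustrate what counting namespaces in predicate position means, here is a small standalone sketch in plain Python. It is not the Pig script itself. It assumes N-Triples-style input with one triple per line, and it assumes that a namespace is the predicate IRI up to and including its last `#` or `/`; the function names are hypothetical.

```python
import re
from collections import Counter

# Matches a simple N-Triples line: subject (IRI or blank node), predicate IRI,
# then the object, followed by the terminating dot.
TRIPLE_RE = re.compile(r'^\s*(<[^>]*>|_:\S+)\s+<([^>]*)>\s+(.+?)\s*\.\s*$')


def namespace_of(iri):
    """Return the IRI up to and including its last '#' or '/' (assumed definition)."""
    cut = max(iri.rfind('#'), iri.rfind('/'))
    return iri[:cut + 1] if cut >= 0 else iri


def count_predicate_namespaces(lines):
    """Count how often each namespace occurs in predicate position."""
    counts = Counter()
    for line in lines:
        match = TRIPLE_RE.match(line)
        if match is None:
            continue  # skip comments, blank lines, and lines that do not parse
        counts[namespace_of(match.group(2))] += 1
    return counts


sample = [
    '<http://example.org/s1> <http://xmlns.com/foaf/0.1/name> "Alice" .',
    '<http://example.org/s1> <http://xmlns.com/foaf/0.1/knows> <http://example.org/s2> .',
    '<http://example.org/s2> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://xmlns.com/foaf/0.1/Person> .',
    '# a comment line, ignored',
]
for namespace, count in count_predicate_namespaces(sample).most_common():
    print(namespace, count)
```

Counting namespaces in subject or object position instead would only require reading a different capture group, which is the kind of change the note above refers to.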

###Schema Information Extraction
To compile, run

ant

in the mapReduce directory.

To run the program, run

hadoop jar lib/datasetAnalysisTools.jar jobs.GetSchemaStats [--reducetasks "number of reducers"]

Please note that "input dir" and "output dir" must be paths on the Hadoop filesystem.

##Links
