Importing TPC-H test data into MongoDB
As described in an earlier post, TPC-H offers an easy way to generate test data in various sizes. Download dbgen from this website and compile it: http://www.tpc.org/tpch/
Now run:
./dbgen -v -s 0.1
This should leave you with some *.tbl files (pipe-separated CSV files). Now you can use my scripts to convert them into JSON and import them into MongoDB.
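To give an idea of what such a conversion involves, here is a minimal Python sketch. It is not the convert_to_json.sh script from the archive; the function names and the way column names are passed in are assumptions made for illustration. It relies on the fact that dbgen ends every row with a trailing pipe and writes no header line, so the column names have to come from somewhere else. All values are kept as strings here.

```python
import json


def tbl_line_to_doc(line, columns):
    """Turn one pipe-separated dbgen row into a dict keyed by column name.

    `columns` is a hypothetical list of column names supplied by the caller,
    since the .tbl files themselves carry no header.
    """
    fields = line.rstrip("\r\n").split("|")
    # dbgen terminates each row with '|', which leaves one empty trailing field.
    if fields and fields[-1] == "":
        fields.pop()
    if len(fields) != len(columns):
        raise ValueError(
            "expected %d fields, got %d" % (len(columns), len(fields))
        )
    return dict(zip(columns, fields))


def convert_tbl(lines, columns):
    """Yield one JSON document per non-empty input row."""
    for line in lines:
        if line.strip():
            yield json.dumps(tbl_line_to_doc(line, columns))


if __name__ == "__main__":
    # Made-up sample row in the shape of the TPC-H region table.
    region_columns = ["r_regionkey", "r_name", "r_comment"]
    for doc in convert_tbl(["0|AFRICA|sample comment|\n"], region_columns):
        print(doc)
```

Writing one JSON document per line is convenient because that is the input format mongoimport reads by default.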
I have already packed some generated files into the archive and added the header, so you don’t have to generate the .tbl files yourself. You only have to adjust the load_into_mongodb.sh script so that it loads into the correct database (if “test” does not suit you).
If you use your own generated .tbl files, you have to run create_mongodb_headers.sh first.
tar -xjvvf mongodb_tpch.tar.bz2
cd mongodb_tpch
./convert_to_json.sh
./load_into_mongodb.sh
By default, the script imports the data into the database “test”, using collections named after the TPC-H tables.
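The naming convention can be sketched in Python as follows. This is an illustration only and not the contents of load_into_mongodb.sh: the helper name and the assumption that each converted file is called <table>.json are hypothetical. The mongoimport options used (--db, --collection, --file) are the standard ones.

```python
from pathlib import Path


def mongoimport_command(json_file, db="test"):
    """Build a mongoimport argument list for one converted file.

    The collection name is taken from the file name without its extension,
    so a hypothetical lineitem.json ends up in the collection "lineitem".
    """
    collection = Path(json_file).stem
    return [
        "mongoimport",
        "--db", db,
        "--collection", collection,
        "--file", str(json_file),
    ]


if __name__ == "__main__":
    # Print the command instead of running it, so nothing is imported by accident.
    print(" ".join(mongoimport_command("lineitem.json")))
```

To load into a different database, pass another value for db; the list can be handed to subprocess.run once mongoimport is on the PATH and a MongoDB server is running.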