This page last modified: Oct 09 2009
title:SQLite from MySQL and AmiGO Gene Ontology keywords:mysql,gene,ontology,go,amigo,sqlite,sqlite3,sql,database,dbi,dbd,perl,postgres,postgresql,install,upgrade,db,rdbms,feature,advanced,export,table,conversion,unix description:A Perl script and instructions to convert MySQL to SQlite using the AmiGO Gene Ontology database as an example. Table of contents ----------------- Introduction Download and run Yet another SQLite, Postgres, MySQL overview Introduction ------------ I loaded geneontology.org's AmiGO database into SQLite. I like the speed, and simplicity of working with SQLite. It works well, is easy to use, is SQL standards compliant, and fast. When I need a big-guns client-server database I use PostgreSQL. The resulting database created below will work with SQLite on Windows or Linux. The supplied commands are Linux. There is an SQLite command line interface, however, the queries against AmiGO tend to be a trifle complex and are better created by a web front ended, Perl scripted CGI back end. Download and run ---------------- 1) Download the MySQL dump SQL statements from geneontology.org. The README is out of date. The full db with gene products is go_YYYYMMDD-seqdblite-data.gz. (They refer to seqdb which no longer exists presumably due to performance issues. They suggest seqdblite. Their old naming convention was go-YYYMMDD-seqdblite.sql. Instead of ".sql", they seem to have changed to "-data.gz".) # This seems to be the most complete download: http://archive.geneontology.org/latest-lite/go_20090823-seqdblite-data.gz # The (dated?) readme: http://archive.geneontology.org/latest-lite/README Download my Perl script that converts MySQL to SQLite. It is based on work I did previously to convert MySQL to PostgreSQL, as well as a shell script you can find at sqlite.org. It may work with any MySQL, but has only been tested with the AmiGO MySQL dump. I provide the script as a .tar for various practical reasons. http://defindit.com/readme_files/mysql2sqlite.tar wget http://archive.geneontology.org/latest-lite/go_20090823-seqdblite-data.gz wget http://defindit.com/readme_files/mysql2sqlite.tar tar -xvf mysql2sqlite.tar chmod +x mysql2sqlite/mysql2sqlite.pl gunzip go_20090823-seqdblite-data.gz cat go_20090823-seqdblite-data.gz | mysql2sqlite/mysql2sqlite.pl > go_20090823-seqdblite-data.sqlite cat go_20090823-seqdblite-data.sqlite | sqlite3 go_seqdblite.db All those commands should run without errors or warnings. My Perl script is a trifle naive, but worked on the Aug 23 2009 GO database. SQLite performance may be fine without indexes, but I won't be surprised if indexes are necessary for more complex queries. Yet another SQLite, Postgres, MySQL overview -------------------------------------------- I wrote this for a non-technical friend. SQLite is a fully transactional SQL relational database management system (RDBMS). SQLite is a single executable and there is no client/server which means that SQLite is simple. Each database is a single file. The database files are portable across all platforms. SQLite is considered an embedded SQL engine. It is so small that it can be embedded inside other applications, or in ROM on devices. For example, SQLite is often used on cell phones. It is fast and standard. Being a single executable and a single database file, administration is nil. Install and run. SQLite is fully transactional, and still as fast as or faster than other SQL databases. SQLite does not use client-server architecture, so in a web application the database must be on the server with the web scripts (although it is best to put the database in a non-web accessible directory). It can easily handle loads that 90% of web sites will experience, but will not scale to large loads. No authentication is necessary or possible, but in a single web server mode this is a non-issue. SQLite is not a daemon, so the security concerns are the same as with any code called from CGI scripts. Oracle, PostgreSQL, and MySQL are all client server RDBMSs. They perform well with really massive numbers of transactions per second. (MySQL is often set up non-transactional; for instance geneontology.or uses the non-transactional MyISAM tables.) Client-server RDBMSs are tricky to install and administer. I have documents to streamline this process for PostgreSQL at: http://defindit.com/readme_files/postgres_utilities.html Cient-server RDBMSs must take into account security issues and authentication. (Postgres and MySQL can be set up without database passwords, but would still need special configuration to be secure.) They are also good when you want authentication. It is possible to hack the database server without necessarily hacking the host server, so the same security precautions must be taken with the database server (daemon) as with any other daemon or service on the server.