

Every special case or option has to be coded, tested, and debugged. Every query has to be coded, tested, and debugged.
#SQLITE TO CSV CODE#
It's easy to forget that even if you use a library to do the CSV parsing, there are coding and maintenance costs to writing special code to query a CSV file. Something like "find all columns where field 5 is greater than 10". Note that if your query is simple parsing and running can happen together.
#SQLITE TO CSV HOW TO#
See here to learn how to load CSV files into sqlite tables.Unless you're doing something very trivial to the CSV, and only doing it once, SQLite will be faster for runtime, coding time, and maintenance time, and it will be more flexible. It’s easy to export sqlite tables or query results to CSV / Parquet files with Python. See here for a Jupyter notebook that contains all the commands covered in this blog post. You need to make sure PyArrow is installed before you can call the to_parquet command. Parquet files are not human readable, but they’re a way better storage format compared to CSV in almost all cases, as explained here. Here’s code that’ll export the trees table to a Parquet file: df = pd.read_sql('SELECT * from trees', conn)ĭf.to_parquet('trees.parquet', index = False) Here’s what the trees_one_command.csv file contains: id,tree_type,has_leaves Here’s what the orders_one_command.csv file contains: id,tree_id,price Notice that the sqlite_master table is queried to get all the tables in the database. Here’s how to export all the sqlite tables to CSV files with a single command: for table in c.execute("SELECT name FROM sqlite_master WHERE type='table' ").fetchall():ĭf = pd.read_sql('SELECT * from ' + t, conn)ĭf.to_csv(t + '_one_command.csv', index = False) This is known as predicate pushdown filtering and can result in a significant performance boost. The SQL query is executed in the database before the data is passed to the DataFrame. Here’s what the orders_over_25.csv file contains: id,tree_id,price df = pd.read_sql_query('SELECT * from orders where price > 25', conn)ĭf.to_csv('orders_over_25.csv', index = False) Here’s how to export all the orders that cost more than $25 to a CSV file. Notice that the CSV file includes the data header row. Here are the contents of the orders.csv file: id,tree_id,price # save sqlite table in a DataFrameĭf = pd.read_sql('SELECT * from orders', conn)

Here’s how to export the orders table to a CSV file. We’re ready to start exporting tables to files now that we have data loaded into a database. Orders_df.to_sql('orders', conn, if_exists='append', index = False) Tree_df.to_sql('trees', conn, if_exists='append', index = False) Orders_df = pd.om_records(orders_data, columns=) Tree_df = pd.om_records(tree_data, columns=) import pandas as pdĬ.execute('''CREATE TABLE trees (id int, tree_type text, has_leaves int)''')Ĭ.execute('''CREATE TABLE orders (id int, tree_id int, price real)''') We’ll use Pandas to create DataFrames that can be loaded into the sqlite database.

The database will contain the following trees and orders tables: We’ll start by creating a sqlite database. It’ll even show how to output all the tables in a sqlite database to files with a single command. It’ll also show how to output SQL queries to CSV files.

This blog post explains how to write sqlite tables to CSV and Parquet files.
