Managing big enough data in postgres
Postgres can work quite well for large databases without having to look at NoSQL solutions. But if you find your tables getting larger than the physical memory, you might want to look for ways to optimize your peformance. In my current consulting assignment, I have to deal with a table that is 104 gigabytes on Amazon RDS running Postgres. This table is at the heart of the system and any query that as much as glances at the table sideways gets incredibly slow. To make matters worse, we had to run a migration on that table that changed the data type of one of the columns from bigint to varchar.
That would just not work. The one way to solve such situations is to partition the table. PostgreSQL has a feature called as table inheritance that allows child tables to share the same structure of the parent table and the child tables can participate in queries to the parent table. What is cool about this is that postgres also supports the concept of table spaces that lets multiple tables be in separate directories or mount points. So technically, you can mount more often accessed data on fast SSDs or an EBS volume with higher provisioned IOPS and move older data to slower low cost storage devices. This also has the additional benefit of reducing the size of working set and being able to perform seq scans in memory rather than index scans on disk. The data I was dealing with was largely time series data and time seemed like the obvious way to partition it. The first order of business was to export existing data to CSV files using the \copy command using limit and offset. The table I was dealing with contained close to about a 100 million rows with multiple JSON fields so I exported them a million at a time.
The next step was to create a clone table that had the same structure as the large table.
1 2 create table large_table_clone as select * from large table where 1=2;
The you can create child tables by specifying the check condition and the table you want to inherit from.
1 2 3 4 5 create table large_table_clone_201504 ( check(start_time >= date '2015-04-01' and start_time < date '2015-05-01') ) inherits(large_table_clone)
We can then create a trigger on the parent table to send records to the child table based on the inserted record's start_time. The trigger would look something like this.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 CREATE OR REPLACE FUNCTION partition_function() RETURNS TRIGGER AS $$ BEGIN IF ( NEW.start_time <= DATE '2015-04-01' AND NEW.start_time > DATE '2015-05-01' ) THEN INSERT INTO large_table_clone_201504 VALUES (NEW.*); ELSIF ( NEW.start_time <= DATE '2015-03-01' AND NEW.start_time > DATE '2015-04-01' ) THEN INSERT INTO large_table_clone_201503 VALUES (NEW.*); ... ELSE RAISE EXCEPTION 'Date out of range'; END IF; RETURN NULL; END; $$ LANGUAGE plpgsql; CREATE TRIGGER large_table_clone_partition BEFORE INSERT ON large_table_clone FOR EACH ROW EXECUTE PROCEDURE partition_function();
The only trouble was I was not sure of the valid date range in my dataset and there was no hope of getting to it from the existing table. So I had to create tables on the fly as data came in.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 create or replace function partition_function() returns trigger as $$ DECLARE table_master varchar := 'large_table_clone'; table_part varchar := ''; start_date date := null; end_date date := null; table_count int; BEGIN table_part := table_master || '_' || to_char(NEW.start_time,'YYYYMM'); start_date := date_trunc('month',NEW.start_time)::date; end_date := (date_trunc('month',NEW.start_time) + '1 month')::date; execute format('select tablename from pg_tables where tablename=%L limit 1',table_part); get diagnostics table_count = ROW_COUNT; if table_count < 1 then execute format( 'create table if not exists %I ( check( start_time <@ [%L,%L]::tsrange) ) inherits (%I)', table_part, start_date, end_date, table_master); end if; execute format( 'insert into %I values (($1).*)', table_part) using NEW; return null; END $$ language plpgsql volatile cost 100;
This dynamically creates tables as fills in the data as they are inserted into large_table_clone. In case you were wondering the COPY command respects the insert trigger so you can bulk load the data exported via CSV files to the new tables.
Since we have table inheritance setup, we can now run queries against the large_table_clone table and it will transparently run the queries across all the child tables and get us the result. But this puts us back where we started. If you really want to get the benefits of partitioning, you will have to use constraint exclusion.
1 2 3 4 set constraint_exclusion=on; select * from large_table_clone where start_time between '2015-01-01' and '2015-02-15';
This will ensure that it only performs the queries on two of the partitions rather than all the partitions. The next obvious step was to create an index on large_table_clone_yyyy_mm partitions on the start_time and cluster the tables using that index. This worked in our cases as the records are usually immutable and older partitions can hence be safely clustered. This made the large table a lot more manageable and we can feel safe archiving old data without impacting the performance of the production system.