I have a table in MySQL named nas_comps.
select comp_code, count(leg_id)
from nas_comps_01012011_31012011 n
group by comp_code;

comp_code    count(leg_id)
'J'          20640
'Y'          39680
First I imported the data into HDFS (Hadoop version 1.0.2) using Sqoop:
sqoop import --connect jdbc:mysql://172.25.37.135/pros_olap2 \
    --username hadoopranch \
    --password hadoopranch \
    --query "select * from nas_comps where dep_date between '2011-01-01' and '2011-01-10' AND \$CONDITIONS" \
    -m 1 \
    --target-dir /pros/olap2/dataimports/nas_comps
Then I created an external partitioned Hive table:
create external table nas_comps(
    DS_NAME string,
    DEP_DATE string,
    CRR_CODE string,
    FLIGHT_NO string,
    ORGN string,
    DSTN string,
    PHYSICAL_CAP int,
    ADJUSTED_CAP int,
    CLOSED_CAP int)
PARTITIONED BY (LEG_ID int, month INT, COMP_CODE string)
location '/pros/olap2/dataimports/nas_comps'
Partition columns are displayed when described:
hive> describe extended nas_comps;
OK
ds_name         string
dep_date        string
crr_code        string
flight_no       string
orgn            string
dstn            string
physical_cap    int
adjusted_cap    int
closed_cap      int
leg_id          int
month           int
comp_code       string

Detailed Table Information
Table(tableName:nas_comps, dbName:pros_olap2_optim, owner:hadoopranch, createTime:1374849456, lastAccessTime:0, retention:0, sd:StorageDescriptor(cols:[FieldSchema(name:ds_name, type:string, comment:null), FieldSchema(name:dep_date, type:string, comment:null), FieldSchema(name:crr_code, type:string, comment:null), FieldSchema(name:flight_no, type:string, comment:null), FieldSchema(name:orgn, type:string, comment:null), FieldSchema(name:dstn, type:string, comment:null), FieldSchema(name:physical_cap, type:int, comment:null), FieldSchema(name:adjusted_cap, type:int, comment:null), FieldSchema(name:closed_cap, type:int, comment:null), FieldSchema(name:leg_id, type:int, comment:null), FieldSchema(name:month, type:int, comment:null), FieldSchema(name:comp_code, type:string, comment:null)], location:hdfs:
But I'm not sure if partitions are created because:
hive> show partitions nas_comps;
OK
Time taken: 0.599 seconds

select count(1) from nas_comps;
returns 0 records
How can I create an external Hive table with dynamic partitions?
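For reference, Hive only lists partitions that are registered in its metastore, so files sitting directly in the table location do not show up as partitions. The usual dynamic-partition pattern is to load the partitioned table from an unpartitioned one. Below is a minimal sketch of that pattern, not a verified fix for this setup. It assumes a hypothetical staging table, nas_comps_staging, that exposes the Sqoop output with all twelve columns, and it assumes the partitioned nas_comps table points at a different directory than the staging data. Whether the MySQL table has its own month column is unknown, so the sketch derives it from dep_date with Hive's month() function.

```sql
-- Enable dynamic partitioning for this session.
SET hive.exec.dynamic.partition=true;
-- nonstrict mode allows every partition column to be dynamic.
SET hive.exec.dynamic.partition.mode=nonstrict;

-- nas_comps_staging is a hypothetical unpartitioned table over the Sqoop files.
-- The partition columns must come last in the SELECT list,
-- in the same order as in the PARTITION clause.
INSERT OVERWRITE TABLE nas_comps PARTITION (leg_id, month, comp_code)
SELECT ds_name, dep_date, crr_code, flight_no, orgn, dstn,
       physical_cap, adjusted_cap, closed_cap,
       leg_id,
       month(dep_date),   -- assumption: month is derived from dep_date
       comp_code
FROM nas_comps_staging;
```

If the insert succeeds, show partitions nas_comps should list one entry per distinct (leg_id, month, comp_code) combination. With many distinct leg_id values, the limits hive.exec.max.dynamic.partitions and hive.exec.max.dynamic.partitions.pernode may need to be raised.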