I am using AWS EMR version 5.2.1 as a data processing environment. When dealing with a huge JSON file that has a complex schema with many nested fields, Hive cannot handle it and fails, because the generated column type definition exceeds the current limit of 4000 characters:
Error processing statement: FAILED: Runtime error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. InvalidObjectException (message: Invalid column type name: [...]
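To give an idea of the shape of the data, here is a minimal, made-up sketch of the kind of table definition involved (the table name, field names, SerDe and location are only placeholders, and the real schema has far more nesting). The point is that the whole struct<...> string is stored in the Metastore as the column's type name, so it grows very quickly:

    -- Hypothetical DDL sketch: the serialized type of `payload` is the entire
    -- struct<...> string; with enough nesting it easily exceeds 4000 characters.
    -- The real schema contains many more nested fields than shown here.
    CREATE EXTERNAL TABLE events (
      id      string,
      payload struct<
                profile:struct<id:bigint, name:string,
                               address:struct<street:string, city:string, country:string>>,
                items:array<struct<sku:string, qty:int, attrs:map<string,string>>>
              >
    )
    ROW FORMAT SERDE 'org.apache.hive.hcatalog.data.JsonSerDe'
    LOCATION 's3://my-bucket/events/';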
Searching around, there are already many issues reported for this or similar problems, although all of them appear to be unresolved [1, 2]. In those threads, the recommendation is to change several Metastore fields to a different type in order to allow longer structure definitions:
- COLUMNS_V2.TYPE_NAME
- TABLE_PARAMS.PARAM_VALUE
- SERDE_PARAMS.PARAM_VALUE
- SD_PARAMS.PARAM_VALUE
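Concretely, on a MySQL-backed Metastore the suggested change looks roughly like this (a sketch only; the Metastore database name `hive` is an assumption, so adjust it to whatever your hive-site.xml points at):

    -- Run directly against the Hive Metastore database in MySQL.
    -- Widens the offending varchar(4000) columns to MEDIUMTEXT.
    USE hive;
    ALTER TABLE COLUMNS_V2   MODIFY TYPE_NAME   MEDIUMTEXT;
    ALTER TABLE TABLE_PARAMS MODIFY PARAM_VALUE MEDIUMTEXT;
    ALTER TABLE SERDE_PARAMS MODIFY PARAM_VALUE MEDIUMTEXT;
    ALTER TABLE SD_PARAMS    MODIFY PARAM_VALUE MEDIUMTEXT;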
As indicated in the first issue, the proposed solution mentions:
"[...] after setting the values, the Metastore must also be configured and restarted."
However, it is nowhere stated what exactly has to be configured, beyond changing the DB values.
Thus, even after updating those fields in the current local Metastore (MySQL in this case) from string to mediumtext and restarting the Metastore process, I still cannot get past the limit: the attempt to load the JSON keeps failing with the same error.
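As a sanity check, a query along these lines against the MySQL side (again assuming the Metastore schema is named `hive`) should confirm whether the columns were actually widened:

    -- Verify the current type of the four Metastore columns.
    SELECT TABLE_NAME, COLUMN_NAME, DATA_TYPE, CHARACTER_MAXIMUM_LENGTH
    FROM information_schema.COLUMNS
    WHERE TABLE_SCHEMA = 'hive'
      AND ((TABLE_NAME = 'COLUMNS_V2' AND COLUMN_NAME = 'TYPE_NAME')
        OR (TABLE_NAME IN ('TABLE_PARAMS', 'SERDE_PARAMS', 'SD_PARAMS')
            AND COLUMN_NAME = 'PARAM_VALUE'));

Even with the columns reported as mediumtext there, the load still fails as described above.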
Am I missing something, or has someone found an alternative way to work around this problem?