How to get input file name as column in AWS Athena external tables

I have external tables in AWS Athena for querying S3 data. The location path contains 1000+ files, so I need the name of the file each row came from, exposed as a column in the table:

select file_name, col1 from table where file_name = 'test20170516'

In short, I need the equivalent of Hive's INPUT__FILE__NAME in AWS Athena (Presto), or any other way to achieve the same.

2 answers

You can do this using the "$path" pseudo-column.

select "$path" from table

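Building on the other answer, you can get just the file name using regexp_extract().

In Athena, "$path" returns the full S3 path, so to get only the file name (with its extension):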

SELECT regexp_extract("$path", '[^/]+$') AS filename from table;

To get the file name without its extension:

SELECT regexp_extract("$path", '[ \w-]+?(?=\.)') AS filename_without_extension from table;

Presto Regular Expression Functions
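
Putting the two answers together, the query from the question could be sketched roughly as follows. This is only an illustration: my_table is a placeholder table name, col1 is taken from the question, and the .csv extension in the filter is an assumption about how the files are named. A column alias cannot be referenced in the WHERE clause of the same SELECT, so the extraction is wrapped in a subquery.

```sql
-- Sketch only: my_table and the '.csv' suffix are assumed, not from the original post.
SELECT file_name, col1
FROM (
    -- "$path" holds the full S3 path of the file each row was read from;
    -- the regex keeps everything after the last slash.
    SELECT regexp_extract("$path", '[^/]+$') AS file_name, col1
    FROM my_table
) AS t
WHERE file_name = 'test20170516.csv';
```

If the files have no extension, or you prefer to match on the name without it, swap in the second pattern above and drop the suffix from the filter.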


Source: https://habr.com/ru/post/1017299/

