How to list all tables in a database using Spark SQL?

I have a SparkSQL connection to an external database:

from pyspark.sql import SparkSession spark = SparkSession \ .builder \ .appName("Python Spark SQL basic example") \ .getOrCreate() 

If I know the name of the table, it is easy to query.

 users_df = spark \ .read.format("jdbc") \ .options(dbtable="users", **db_config) \ .load() 

But is there a good way to list / open tables?

I want the equivalent of SHOW TABLES in mysql or \dt in postgres.

I use pyspark v2.1 if that matters.

+5
source share
1 answer

The answer to this question is not really spark specific. You just need to download information_schema.tables .

An information scheme consists of a set of views containing information about objects defined in the current database. The information schema is defined in the SQL standard and therefore can be assumed to be portable and remain stable - unlike system directories that relate to RDBMS and are modeled after implementation problems.

I will use MySQL for my code snippet that contains the enwiki database on which I want to list the tables:

 # read the information schema table spark.read.format('jdbc'). \ options( url='jdbc:mysql://localhost:3306/', # database url (local, remote) dbtable='information_schema.tables', user='root', password='root', driver='com.mysql.jdbc.Driver'). \ load(). \ filter("table_schema = 'enwiki'"). \ # filter on specific database. show() # +-------------+------------+----------+----------+------+-------+----------+----------+--------------+-----------+---------------+------------+----------+--------------+--------------------+-----------+----------+---------------+--------+--------------+-------------+ # |TABLE_CATALOG|TABLE_SCHEMA|TABLE_NAME|TABLE_TYPE|ENGINE|VERSION|ROW_FORMAT|TABLE_ROWS|AVG_ROW_LENGTH|DATA_LENGTH|MAX_DATA_LENGTH|INDEX_LENGTH| DATA_FREE|AUTO_INCREMENT| CREATE_TIME|UPDATE_TIME|CHECK_TIME|TABLE_COLLATION|CHECKSUM|CREATE_OPTIONS|TABLE_COMMENT| # +-------------+------------+----------+----------+------+-------+----------+----------+--------------+-----------+---------------+------------+----------+--------------+--------------------+-----------+----------+---------------+--------+--------------+-------------+ # | def| enwiki| page|BASE TABLE|InnoDB| 10| Compact| 7155190| 115| 828375040| 0| 975601664|1965031424| 11359093|2017-01-23 08:42:...| null| null| binary| null| | | # +-------------+------------+----------+----------+------+-------+----------+----------+--------------+-----------+---------------+------------+----------+--------------+--------------------+-----------+----------+---------------+--------+--------------+-------------+ 
+8
source

Source: https://habr.com/ru/post/1265659/


All Articles