Submit a Spark job from C# and get results

As per the title, I would like to send a computation to a Spark cluster (local or HDInsight in Azure) and get the results back in a C# application.

I am aware of Livy, which I understand is a REST API application sitting on top of Spark to query it, but I have not found a standard C# API for it. Is this the right tool for the job? Is it just missing a well-known C# API?

The Spark cluster needs to access Azure Cosmos DB, so I need to be able to submit the job together with the connector jar library (or its path on the cluster driver) so that Spark can read data from Cosmos.

+6
4 answers

Since I could not find a .NET Spark client for querying data, I wrote one:

https://github.com/UnoSD/SparkSharp

It's just a quick implementation, but it can also query Cosmos DB using Spark SQL.

It is a C# client for Livy.

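// `config` (the session settings) is assumed to be defined elsewhere
using (var client = new HdInsightClient("clusterName", "admin", "password"))
using (var session = await client.CreateSessionAsync(config))
{
    // Run a Scala statement on the cluster and read back the printed value
    var sum = await session.ExecuteStatementAsync<int>("val res = 1 + 1\nprintln(res)");

    const string sql = "SELECT id, SUM(json.total) AS total FROM cosmos GROUP BY id";

    // Run a Spark SQL query against a Cosmos DB collection;
    // `Result` is assumed to be a user-defined type matching the query columns
    var cosmos = await session.ExecuteCosmosDbSparkSqlQueryAsync<IEnumerable<Result>>
    (
        "cosmosName",
        "cosmosKey",
        "cosmosDatabase",
        "cosmosCollection",
        "cosmosPreferredRegions",
        sql
    );
}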
+4

If you only need to query the Spark cluster with SparkSql, here is a way to do it from C#:

https://github.com/Azure-Samples/hdinsight-dotnet-odbc-spark-sql/blob/master/Program.cs

The sample requires the Spark ODBC driver. You can download it here:

https://www.microsoft.com/en-us/download/details.aspx?id=49883

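Note: the sample needs one change so that the connection string includes the DSN. Immediately after the first line below, add the second one: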

connectionString = GetDefaultConnectionString();

connectionString = connectionString + "DSN=Sample Microsoft Spark DSN";

If you changed the DSN name when installing the ODBC driver, use that name in the line above.
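For illustration, here is a minimal sketch of the ODBC route using `System.Data.Odbc`. The `SparkOdbc` class and its method names are hypothetical and are not taken from the linked sample; the sketch assumes the Spark ODBC driver is installed and a DSN with the given name has been configured.

```csharp
using System;
using System.Collections.Generic;
using System.Data.Odbc;

// Hypothetical helper, not part of the linked Azure sample.
public static class SparkOdbc
{
    // Appends "DSN=<name>" to a connection string, adding a ';' separator
    // only when the existing string does not already end with one.
    public static string WithDsn(string connectionString, string dsnName)
    {
        var needsSeparator = connectionString.Length > 0
            && !connectionString.TrimEnd().EndsWith(";", StringComparison.Ordinal);
        return connectionString + (needsSeparator ? ";" : "") + "DSN=" + dsnName;
    }

    // Runs a Spark SQL query over ODBC and returns the first column of every row as text.
    public static List<string> QueryFirstColumn(string connectionString, string sql)
    {
        var values = new List<string>();
        using (var connection = new OdbcConnection(connectionString))
        using (var command = new OdbcCommand(sql, connection))
        {
            connection.Open();
            using (var reader = command.ExecuteReader())
            {
                while (reader.Read())
                {
                    values.Add(Convert.ToString(reader.GetValue(0)));
                }
            }
        }
        return values;
    }
}
```

You would call `SparkOdbc.WithDsn(connectionString, "Sample Microsoft Spark DSN")` and pass the result, together with a query, to `QueryFirstColumn`.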

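Since you need data from Cosmos, you could open a Jupyter Notebook on the cluster (HDInsight), load the data into Spark / a table there, and then query it from C#.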

Also, if you want to write the Spark job in scala/python and trigger it from C#, then LIVY is the best way. You could also try Mobius.
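As a rough sketch of the Livy route, the job can be submitted from C# with a plain `HttpClient` call to Livy's `POST /batches` endpoint, whose request body accepts `file`, `className`, and `jars` (which is where a Cosmos DB connector jar would go). The `LivyBatch` class is hypothetical. The basic authentication and the `X-Requested-By` header are assumptions about how the cluster's Livy endpoint is configured, so adjust them for your setup.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Net.Http;
using System.Net.Http.Headers;
using System.Text;
using System.Text.Json;
using System.Threading.Tasks;

// Hypothetical helper for illustration only.
public static class LivyBatch
{
    // Builds the JSON body for Livy's POST /batches endpoint.
    // "file" is the application to run; "jars" lists extra jars for the job.
    public static string BuildBody(string file, string className, IEnumerable<string> jars)
    {
        var body = new Dictionary<string, object>
        {
            ["file"] = file,
            ["className"] = className,
            ["jars"] = jars.ToArray()
        };
        return JsonSerializer.Serialize(body);
    }

    // Posts the batch and returns Livy's raw JSON response (it includes the batch "id" and "state").
    public static async Task<string> SubmitAsync(string livyUrl, string user, string password, string body)
    {
        using var http = new HttpClient();
        // Assumption: the endpoint uses HTTP basic authentication.
        var token = Convert.ToBase64String(Encoding.UTF8.GetBytes(user + ":" + password));
        http.DefaultRequestHeaders.Authorization = new AuthenticationHeaderValue("Basic", token);
        // Assumption: Livy's CSRF protection is enabled and expects this header on POST requests.
        http.DefaultRequestHeaders.Add("X-Requested-By", user);

        using var content = new StringContent(body, Encoding.UTF8, "application/json");
        using var response = await http.PostAsync(livyUrl.TrimEnd('/') + "/batches", content);
        response.EnsureSuccessStatusCode();
        return await response.Content.ReadAsStringAsync();
    }
}
```

A batch only reports its state (for example via `GET /batches/{id}/state`), so the job itself has to write its results somewhere the C# application can read them. For results returned directly in the response, Livy's interactive sessions (`POST /sessions` followed by `POST /sessions/{id}/statements`) are the alternative.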

+2

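Microsoft has released DataFrame-based .NET support for Apache Spark through the .NET Foundation as OSS. For details, see http://dot.net/spark and http://github.com/dotnet/spark. It is available by default in HDInsight if you pick the right HDP/Spark version (currently 3.6 and 2.3).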

0

Update:

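Long ago I answered this question with a clear "no". Times have changed, however, and Microsoft has made an effort. Check out https://dotnet.microsoft.com/apps/data/spark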

    // Create a Spark session
    var spark = SparkSession
        .Builder()
        .AppName("word_count_sample")
        .GetOrCreate();

That was C#!

Original answer:

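No, C# is not the tool to choose if you want to work with Spark! However, if you really want to do the job with it, try Mobius, as mentioned above: https://github.com/Microsoft/Mobius.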

Spark has APIs for 4 main languages: Scala, Java, Python, R. For production use, I would not recommend the R API. The other 3 work well.

For the Cosmos DB connection, I would suggest: https://github.com/Azure/azure-cosmosdb-spark

-3

Source: https://habr.com/ru/post/1680503/

