Connect to Azure SQL with AAD SP in Spark with Scala
How to example
In order to connect to to an Azure SQL Server using a Service Principal, you need to add the following library to your cluster:
Maven Repository: com.microsoft.azure » adal4j » 1.6.7 (mvnrepository.com)
Maven coordinates: com.microsoft.azure:adal4j:1.6.7 - Version at 2022
And then use the following code to retrieve a token that Spark will use for the authentication.
import com.microsoft.aad.adal4j.AuthenticationContext;
import com.microsoft.aad.adal4j.ClientCredential;
import java.util.concurrent.Executors;
val tenantId = "xxxxx-xxxx-xxxx-xxxx"
val authorityURL= s"https://login.microsoftonline.com/${tenantId}"
val spId = "xxxxx-xxxx-xxxx-xxxx"
val spSecret = "xxxxxxxxx"
// Obtaining Token
val service = Executors.newFixedThreadPool(1);
val context = new AuthenticationContext(authorityURL, true, service);
val credential = new ClientCredential(spId , spSecret );
val result = context.acquireToken("https://database.windows.net/", credential, null).get();
val jdbcUrl = s"jdbc:sqlserver://${jdbcHostname}:${jdbcPort};database=${databaseName};encrypt=true;trustServerCertificate=true;hostNameInCertificate=*.database.windows.net;loginTimeout=30"
// Reading data
val df= spark.read
.format("jdbc")
.option("url", jdbcUrl )
.option("dbtable", tableName)
.option("accessToken", result.getAccessToken())
.load()
// Writing data
df.write
.format("jdbc")
.mode("append")
.option("url", jdbcUrl)
.option("dbtable", tableName)
.option("accessToken", result.getAccessToken())
.save()
You need to provide:
- “tenantId”: of your azure environment
- “spId”: client id of the service principal with access to the database
- “spSecret”: client secret of the service principal
And the database configuration: jdbcHostname, jdbcPort, databaseName and tableName.